The Open Web Platform (OWP) is a platform for innovation, consolidation and cost efficiencies, focused on those things that happen within, or intersect with, the actions of the Web browser.
And so why doesn't this work for Science on the Web? Or does it?
We want to lower the bar for publishing scientific data on the Web so that we enable the network effect while still retaining some aspect of semantics and interoperability.
~500K datasets from data.gov, May 8th, 2012
Little Problems
Too big: Data sets are typically too large to be processed by the typical Open Web Platform (OWP) implementation as one large Web resource.
Too dumb: HTML table markup lacks the constructs to convey all the information coded within typical tabular data sets.
Too forgetful: Accessing data may require formulating complex queries or URIs, which is error-prone. Users can request too much data, which results in failures or requires paging results.
Where's the smart pig with the brick house?
Or was he dinner?
Simple is good but questions remain:
<table typeof="Table">
  <thead>
    <tr>
      ...
      <th property="column" typeof="Column">
        <span property="title">Temperature</span>
        <span property="property" resource="w:airTemperature"/>
        <span property="valueSpace" typeof="ValueDescription">
          (°<span property="symbol">C</span>)
          <span property="datatype" resource="xsd:double"/>
          <span property="quantity" resource="quantity:ThermodynamicTemperature"/>
          <span property="unit" resource="unit:DegreeCelsius"/>
        </span>
      </th>
      ...
    </tr>
  </thead>
  <tbody>
    <tr>
      ...
      <td>22.2</td>
      ...
    </tr>
<a typeof="Partition" rel="nearby" href="http://www.mesonet.info/data/q/5/n/767/2014-02-12T06:00:00Z">
  767
  <span property="range" typeof="FacetPartition">
    <span property="facet" resource="/data/#latitude"/>
    <span property="facet" resource="/data/#longitude"/>
    <span property="shape" typeof="schema:GeoShape">
      <span property="schema:box" content="40 -130 35 -130 35 -125 40 -125"/>
    </span>
  </span>
  <span property="range" typeof="FacetPartition">
    <span property="facet" resource="/data/#receivedTime">Received</span>
    <span property="valueType" resource="xsd:dateTime"/>
    from <span property="start">2014-02-12T06:00:00Z</span>
    to <span property="end">2014-02-12T06:30:00Z</span>
    (<span property="length">PT30M</span>)
  </span>
</a>
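The time range in the partition markup above carries a start, an end, and an ISO 8601 length. A minimal sketch of how a client might check those values against each other is shown here. The function names (`durationToMs`, `rangeEnd`, `inRange`) are hypothetical, only simple `PT…H…M…S` durations are handled, and treating the range as half-open (start included, end excluded) is an assumption, not something the markup states.

```javascript
// Hypothetical helper: convert a simple ISO 8601 time duration
// (hours, minutes, seconds only, e.g. "PT30M") to milliseconds.
function durationToMs(iso) {
  const m = /^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$/.exec(iso);
  if (!m) throw new Error("unsupported duration: " + iso);
  const hours = Number(m[1] || 0);
  const minutes = Number(m[2] || 0);
  const seconds = Number(m[3] || 0);
  return ((hours * 60 + minutes) * 60 + seconds) * 1000;
}

// Compute the end of a range from its start and length.
function rangeEnd(start, length) {
  const end = new Date(Date.parse(start) + durationToMs(length));
  // toISOString() always emits milliseconds; drop them when they are zero.
  return end.toISOString().replace(".000Z", "Z");
}

// Assumption: ranges are half-open, [start, start + length).
function inRange(time, start, length) {
  const t = Date.parse(time);
  const s = Date.parse(start);
  return t >= s && t < s + durationToMs(length);
}

console.log(rangeEnd("2014-02-12T06:00:00Z", "PT30M"));
```

With the values from the markup, the computed end agrees with the `end` property of the partition.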
DW3904>APRS,TCPXX*,qAX,CWOP:@090158z5132.18N/00043.53W_061/000g001t030r000p000P000h87b10389L000.DsVP
CW1604>APRS,TCPXX*,qAX,CWOP:@090158z4444.70N/06531.17W_204/004g009t027r000p000P000h80b10204.DsVP
DW6741>APRS,TCPXX*,qAX,CWOP:@090158z3749.55N/08000.08W_296/005g...t036r...p...P008h74b10188.DsVP
DW6916>APRS,TCPXX*,qAX,CWOP:@090158z4310.23N/10818.40W_238/001g002t027r000p000P000h58b10189.DsVP
DW6011>APRS,TCPXX*,qAX,CWOP:@090158z4307.07N/08756.60W_261/002g006t028r000p000P000h55b10249.DsVP
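To show what is packed into one of these raw lines, here is a rough sketch of a decoder for the subset of fields visible above. The function name `parseCwopPacket` is hypothetical. The field meanings follow the usual APRS weather report conventions (position as degrees and decimal minutes, `t` as temperature in °F, `h` as relative humidity with `00` meaning 100%, `b` as pressure in tenths of a millibar). Wind and rain fields are ignored, and other packet variants are not handled.

```javascript
// Decode station, position, temperature, humidity and pressure
// from one CWOP/APRS weather line of the shape shown above.
function parseCwopPacket(line) {
  const m = line.match(
    /^([^>]+)>[^:]*:@(\d{6})z(\d{2})(\d{2}\.\d{2})([NS])\/(\d{3})(\d{2}\.\d{2})([EW])_(.*)$/
  );
  if (!m) return null; // not a packet in the expected form

  // Position is degrees plus decimal minutes; S and W are negative.
  const lat = (Number(m[3]) + Number(m[4]) / 60) * (m[5] === "S" ? -1 : 1);
  const lon = (Number(m[6]) + Number(m[7]) / 60) * (m[8] === "W" ? -1 : 1);

  // Read a numeric weather field; returns null when it is absent
  // or filled with dots (no reading).
  const weather = m[9];
  const field = (re) => {
    const f = weather.match(re);
    return f ? Number(f[1]) : null;
  };

  const tempF = field(/t(-\d{2}|\d{3})/);
  const humidity = field(/h(\d{2})/);
  const pressure = field(/b(\d{5})/);

  return {
    station: m[1],
    dayHourMinuteUtc: m[2], // DDHHMM, UTC
    latitude: lat,
    longitude: lon,
    temperatureC: tempF === null ? null : ((tempF - 32) * 5) / 9,
    humidityPercent: humidity === 0 ? 100 : humidity,
    pressureMbar: pressure === null ? null : pressure / 10,
  };
}

const sample =
  "DW3904>APRS,TCPXX*,qAX,CWOP:@090158z5132.18N/00043.53W_061/000g001t030r000p000P000h87b10389L000.DsVP";
console.log(parseCwopPacket(sample));
```

Even this small decoder has to know the unit and encoding of every field in advance, which is exactly the information the annotated table markup carries explicitly.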
PAN-enabled Web Resources are:
The data is not:
The data is annotated with RDFa.
For geospatial data, partitioning provides a good baseline for algorithms.
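As an illustration of geospatial partitioning, the sketch below finds the fixed-size quadrangle that contains a point and formats its corners in the same order as the `schema:box` value in the partition markup (north-west, south-west, south-east, north-east). The function names `quadrangleFor` and `toSchemaBox` are hypothetical, and aligning quadrangles to multiples of the size in degrees is an assumption. Points on the poles or the antimeridian are not treated specially.

```javascript
// Find the quadrangle (in degrees) that contains a point, assuming
// quadrangle edges fall on multiples of `size`.
function quadrangleFor(lat, lon, size) {
  const south = Math.floor(lat / size) * size;
  const west = Math.floor(lon / size) * size;
  return { north: south + size, south: south, west: west, east: west + size };
}

// Format the corners as "lat lon" pairs: NW, SW, SE, NE.
function toSchemaBox(q) {
  return [
    q.north, q.west,
    q.south, q.west,
    q.south, q.east,
    q.north, q.east,
  ].join(" ");
}

console.log(toSchemaBox(quadrangleFor(37.2, -127.5, 5)));
```

Because every point maps to exactly one quadrangle, a client can compute which partition to request instead of formulating a query.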
Find a table of data:
// (1) Find the element that holds the partition
var datasets = document.getElementsByType("pan:Partition");
// (2) Use the subject to find the partition's item subjects
var items = document.data.getValues(datasets[0].data.id,"pan:item");
// (3) Access the first item (a table)
var table = document.getElementsBySubject(items[0])[0];
Find a column:
var columns = document.data.getValues(table.data.id,"pan:column");
var column = null; // A variable to hold the subject URI.
for (var i=0; !column && i<columns.length; i++) {
   // Find the column labeled with the air temperature property
   if (document.data.getValues(columns[i],"pan:property")
         .indexOf("http://mesonet.info/airTemperature")>=0) {
      column = columns[i];
   }
}
// Find the index by finding the column element by subject URI.
var index = document.getElementsBySubject(column)[0].cellIndex;
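Once a column index is known, the cell values can be pulled out and summarized with ordinary array operations. The sketch below is DOM-free so it can run anywhere. The function name `columnMean` is hypothetical, and the rows are assumed to have already been read into arrays of cell text (in a browser, for example, from the `textContent` of each cell in the table body).

```javascript
// Average the numeric values in one column of a table.
// `rows` is an array of rows, each an array of cell text.
function columnMean(rows, index) {
  const values = rows
    .map((row) => parseFloat(row[index])) // cell text to number
    .filter((v) => !Number.isNaN(v));     // skip empty or non-numeric cells
  if (values.length === 0) return null;
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Illustrative rows only; the station names and readings are made up.
const rows = [
  ["station-a", "22.2"],
  ["station-b", "23.8"],
  ["station-c", ""],
];
console.log(columnMean(rows, 1));
```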
The smart pig uses map / reduce!
Query & paging gets you all tied up.
Iterative weighted averages based on observed values:
where
There will be a test later!
It produces the typical colored gradient of surfaces for temperature and similar quantities, for use in visualizations (e.g. over maps).
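The formula for the iterative weighted averages is not reproduced in this text. As a stand-in, the sketch below uses a Barnes-style successive correction, which is one common scheme of this kind. That choice, the Gaussian weight, and all names and parameters (`barnes`, `kappa`, `gamma`, `passes`) are assumptions rather than the method used in the demonstration.

```javascript
// Gaussian weight from the squared distance between two points.
function weight(dx, dy, kappa) {
  return Math.exp(-(dx * dx + dy * dy) / kappa);
}

// Distance-weighted average of `values` (one per observation) at (x, y).
// Returns NaN when no observation has any influence at that point.
function weightedAverage(obs, values, x, y, kappa) {
  let num = 0;
  let den = 0;
  for (let i = 0; i < obs.length; i++) {
    const w = weight(obs[i].x - x, obs[i].y - y, kappa);
    num += w * values[i];
    den += w;
  }
  return den > 0 ? num / den : NaN;
}

// First pass: weighted average of the observed values.
// Later passes: add a weighted average of the residuals at the
// observation points, using a sharper weight (kappa scaled by gamma).
function barnes(obs, grid, kappa, gamma, passes) {
  const observed = obs.map((o) => o.value);
  let atGrid = grid.map((p) => weightedAverage(obs, observed, p.x, p.y, kappa));
  let atObs = obs.map((o) => weightedAverage(obs, observed, o.x, o.y, kappa));
  let k = kappa;
  for (let pass = 1; pass < passes; pass++) {
    k *= gamma;
    const residuals = obs.map((o, i) => o.value - atObs[i]);
    atGrid = atGrid.map(
      (g, j) => g + weightedAverage(obs, residuals, grid[j].x, grid[j].y, k)
    );
    atObs = atObs.map(
      (g, i) => g + weightedAverage(obs, residuals, obs[i].x, obs[i].y, k)
    );
  }
  return atGrid;
}

// Two made-up observations; estimate the value halfway between them.
const obs = [
  { x: 0, y: 0, value: 0 },
  { x: 2, y: 0, value: 10 },
];
console.log(barnes(obs, [{ x: 1, y: 0 }], 1.0, 0.3, 2));
```

Each grid value depends only on the observations near it, so the work can be split by partition and the results combined afterwards.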
For … / …° at … for … minutes, coloring by range (… °C, … °C) with quadrangle size …°.
One should not underestimate the value of "view source" in the development of the Web.
We want to extend this to scientific data:
a "copy and modify" model
"go viral" on the Web
What will the hacker in the corner do with scientific data?