big data
⇒ Managing Documents in MarkLogic with XProc
### Summary
⇒ Experiments with Big Weather Data in MarkLogic - Doomed Approach
The [“Naive Approach”](http://www.milowski.com/journal/entry/2012-04-11T11:08:29.62-07:00/) of just importing the weather reports verbatim works if all you want to do is enumerate a particular weather report's data by segments of time. That is, this expression works really well:
⇒ Experiments with Big Weather Data in MarkLogic - Right-sizing and Indexing
A lot has happened since my last update on my “big data” weather experiment with [MarkLogic](http://www.marklogic.com/). I've been through a server crash and low-memory trouble, reloaded my database, calculated the actual server requirements, and migrated to a new server. In summary: “Whew! That was a lot of work and hair-pulling.”
⇒ Too Much Data and Too Little Memory
Since my last update, I've made a lot of good progress on my “big weather data” project. The goal was always to understand how to organize scientific sensor data like weather reports within a database system like MarkLogic. Alas, I don't currently have access to the hardware needed to build a production-quality system that actually stores terabytes of information. I did want to see how far I could get and what the characteristics of a cluster would need to be to store such large-scale information.
⇒ Experiments with Big Weather Data in MarkLogic - The Naive Approach
I've heard over and over that [MarkLogic](http://www.marklogic.com/) is a fantastic XML database: you just import your documents and query away! Given the quality of the people that I personally know at MarkLogic, I'm sure that's true. Still, I wanted to put that to the test. Every database system has techniques for getting reasonable or “blindingly fast” performance, and I wanted to see how that works and at what cost.
⇒ Experiments with Big Weather Data in MarkLogic - Introduction
Over the past couple of months, I've been experimenting with “big data” on the web for scientific purposes. The goal is to take my research on geospatial scientific data on the web and use [MarkLogic](http://www.marklogic.com/) to create a repository for large sensor data. My current scientific area of focus is weather data (sensor data in general) that I'm collecting through the [Citizen Weather Observation Program (CWOP)](http://www.wxqa.com/).