MarkLogic
⇒ Disk Soup: AWS, EBS, RAID, MarkLogic, and Pinch of Salt!
At the [2013 MarkLogic User Conference](http://www.marklogic.com/events/marklogic-world-2013/), I learned all kinds of interesting and valuable information about running [MarkLogic](http://www.marklogic.com/) on [AWS (Amazon Web Services) EC2 servers](http://aws.amazon.com/). In particular, it was mentioned that I wasn't necessarily going to get a huge performance gain over regular EBS storage from the RAID 10 configuration that I cooked up. That was good news to me, because all that extra EBS storage for RAID 10 costs me quite a bit.
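For context, a software RAID 10 array over EBS volumes is typically assembled with `mdadm`. This is a minimal sketch, not the author's actual setup: the four device names (`/dev/xvdf` through `/dev/xvdi`) and the mount point are hypothetical.

```shell
# Hypothetical sketch: stripe + mirror four attached EBS volumes
# into a software RAID 10 array. Device names are assumptions.
sudo mdadm --create /dev/md0 --level=10 --raid-devices=4 \
    /dev/xvdf /dev/xvdg /dev/xvdh /dev/xvdi

# Put a filesystem on the array and mount it where MarkLogic
# keeps its forests by default.
sudo mkfs -t ext4 /dev/md0
sudo mkdir -p /var/opt/MarkLogic
sudo mount /dev/md0 /var/opt/MarkLogic
```

Note the cost trade-off the post alludes to: RAID 10 mirrors every stripe, so four volumes deliver only two volumes' worth of usable capacity, doubling the EBS bill for the same storage.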
⇒ Managing Documents in MarkLogic with XProc
⇒ Experiments with Big Weather Data in MarkLogic - Doomed Approach
The [“Naive Approach”](http://www.milowski.com/journal/entry/2012-04-11T11:08:29.62-07:00/) of just importing the weather reports verbatim works if all you want to do is enumerate a particular weather report's data by segments of time. That kind of time-segment query works really well.
⇒ Do Elements have URIs?
I was discussing a problem with triples generated from RDFa and the in-browser applications I have developed using [Green Turtle](https://github.com/alexmilowski/green-turtle) with a learned colleague of mine whose opinions I value greatly. In short, I wanted to duplicate the kinds of processing I'm doing in the browser so I can run it through XProc and do more complicated processing of the documents. Yet, I rely on the *origin of triples* in the document for my application to work.
⇒ Experiments with Big Weather Data in MarkLogic - Right-sizing and Indexing
A lot has happened since my last update on my “big data” weather experiment with [MarkLogic](http://www.marklogic.com/). I've been through a server crash, low-memory trouble, a database reload, a calculation of the actual server requirements, and a migration to a new server. In summary: “Whew! That was a lot of work and hair pulling.”
⇒ Too Much Data and Too Little Memory
Since my last update, I've made a lot of good progress on my “big weather data” project. The goal was always to understand how to organize scientific sensor data, like weather reports, within a database system like MarkLogic. Alas, I don't currently have access to the hardware to produce a production-quality system that actually stores terabytes of information. I did want to see how far I could get and what the characteristics of a cluster storing such large-scale information would be.
⇒ Experiments with Big Weather Data in MarkLogic - The Naive Approach
I've heard over and over that [MarkLogic](http://www.marklogic.com/) is a fantastic XML database: you just import your documents and query away! Given the quality of the people I know personally at MarkLogic, I'm sure that's true. Still, I wanted to put that to the test. Every database system has techniques for getting reasonable or “blindingly fast” performance, and I wanted to see how that works and at what cost.
⇒ Experiments with Big Weather Data in MarkLogic - Introduction
Over the past couple of months, I've been experimenting with “big data” on the web for scientific purposes. The goal is to take my research on geospatial scientific data on the web and use [MarkLogic](http://www.marklogic.com/) to create a repository for large sensor data. My current scientific area of focus is weather data (sensor data in general), which I'm collecting through the [Citizen Weather Observer Program (CWOP)](http://www.wxqa.com/).
⇒ XProc on My Website
I've migrated my whole website to run on a combination of [XProc](http://www.w3.org/TR/XProc), [Restlet](http://www.restlet.org), and the new [Atomojo V2](http://code.google.com/p/atomojo) server. Atomojo V2 provides an Atom APP backend powered by XProc and [MarkLogic](http://www.marklogic.com), glued together using Restlet. The same architecture has been used to deploy this website. That is, almost all the pages are the result of running some XProc-enabled process.
⇒ Disk Space is Important!
Sometimes you learn interesting things under duress, reaffirm things you already know, and pay for not doing it right the first time.