Internet Musings

Generic Product, how does it feel to be both NEW! and IMPROVED!? And, as a follow up, how do you manage to pull off 20% more FREE!?

Well, random internet user, we accomplish this by being very NEW and IMPROVED in all aspects of our business. Customers like you allow us to offer 20% more FREE. Thank you for the question.

applause.

Something from the internet.

Mamma Dee’s Garden Members Meeting 5/19/13

Last Sunday we had our first official Mamma Dee’s Community Garden members meeting. While it rained outside everyone got cozy in the local Juice Hugger shop (owned and operated by two of our members).

We introduced ourselves and gave short report backs about what we were working on in the garden. Some highlights:

  • Rogers Ave Association is planning a block party on Rogers Ave in late July or early August. There was general agreement that it would be good to participate.
  • There was some re-arranging of plots leaving plot P open. Check out the garden map to see where the plots are.
  • We are using the yellow tool box to post important member information. Drew also built a bulletin board on the gate; if anyone has ideas to make it more useful, go for it!
  • We picked up plant donations from Kingsboro (cabbage, eggplant, kale–lots of stuff), it’s all up for grabs right when you walk in the garden (in the plastic starter holders)
  • Remember to turn off the water!
  • You can buy plant starters from Green Markets with EBT, see sign on the garden gate.
  • Picked up 90 tulips from Brooklyn Botanical Gardens

Agenda Items:

Weekly Hours

Garden season runs from the beginning of April until the end of October. Greenthumb asks that we post our open hours and stay open for a minimum of 20 hours a week. Leanna and Pam collected people’s time/day preferences and agreed to turn them into a calendar. The calendar will be posted in the garden and members will sign up for shifts.

Taking Care of Garden Tasks

I suggested that we form working groups to take care of tasks and projects around the garden. This way we don’t have to wait for the general meeting each month and spend a lot of time going over the nitty gritty details of projects. We agreed to form working groups around projects and that each working group will have a “bottom liner”. The bottom liner’s role is to coordinate with the working group and be the information clearing house for the group. We spent a few minutes putting together some working groups:

Tulip Group

Bottom liner: Erin
Members: Erin, Gloria, Andrea, Leanna, Leah, Drew

Painting Gazebo Group

Bottom liner: Colleen
Members: Colleen, Rob, Carl, Kelly

The members attending the meeting liked the idea of staining the gazebo rather than painting it.

Children’s Box Group

Bottom liner: Leanna
Members: Leanna, Drew, Pam

The Children’s box is by the gazebo (box S), it’s cute, things have happened in the past there. We didn’t have open hours though, so it was tough. Need to do outreach and get the plot going.

Compost Group

Bottom liner: Rob
Members: Rob, Leah, Colleen, Drew, Pam

Communications Group

Bottom liner: Drew
Members: Drew, Leah, Devin, Rob, Colleen

Events Group

Bottom liner: Drew
Members: Drew, Colleen, Rob, Devin

I’ve put together a page on the website where we can add and track working groups

New Treasurer

We unanimously elected Leanna as the new garden treasurer. Leanna has a background in accounting with large non-profits so I think we’ll be in good hands.

Plot P

Member Rob has been working on building out the plot directly to the right as you walk into the garden. It was suggested that we use it as a community plot, use anything grown in there for community events like BBQs and generally share the plot. (I’ve already planted snap peas against the fence beyond the plot)

That about does it! It’s been super rainy all week so the plants are sure to be very happy.

Read the full meeting notes here

Taarifa + Data Anywhere, a report back

Last week I was flown out to Birmingham UK for the H2D2 hack-a-thon. I left Brooklyn on the first day to break 80 and landed hours later in a moist cold England, so it goes without saying that I spent a good amount of time in front of my computer.

We started the weekend off on Friday night by going around and chatting about our intentions, projects, and the website we’d choose if we could only view one website ever again (I chose reddit). I introduced Data Anywhere. A gentleman named Mark Iliffe spoke about a project he was involved in called Taarifa, a mapping system built on Ushahidi that is currently deployed in Uganda, helping the government there deal with issues citizens report.

They were interested in building an API to handle their workflows (GETing and POSTing reports). I saw an obvious connection to the project and decided to fold in with them. I’m going to talk to the other people who are involved with Data Anywhere and advocate to merge in with this project. It is already established and has a very similar use case to Data Anywhere. Let’s pause to understand what an API is:

An Application Programming Interface (API) is a set of functions, procedures, methods or classes used by computer programs to request services from the operating system, software libraries or any other service providers running on the computer. A computer programmer uses the words in the API to make application programs.

- From Simple Wikipedia

Nico, the technical lead (not his official title), mapped out the same basic tech stack for the Taarifa API as we used for Data Anywhere: MongoDB with a Python + Flask API. So it made a lot of sense to work with them.

A little background

Data Anywhere (DA) seeks to be a solution for sharing data sets between organizations, advocates, and the general public in a manner that protects privacy while also opening up as much data as possible. To achieve this, data collecting organizations will host their data on a secure server and place on top of it a web API which will allow authenticated parties to query the data set.

Taarifa does just about the same thing, only its focus is more on individuals self reporting, where DA’s focus was more on organizations collecting the info (e.g. canvassing).

Both solutions need to answer a very important question:

Which data standard do we use?

This is a complex question. To break it down a bit we can think of the data standard as the web form that one fills out to commit data to the system. The information the form accepts is based on the data standard. One good data standard is Open 311, which sets a standard for 311-style data, like reporting a downed tree or smelly neighbor. In Taarifa’s case, this works very well. Currently in Uganda Taarifa is collecting data that would fit well into Open 311’s standard, but this standard wouldn’t ask the right questions of people after a disaster (no questions about mold).

Obviously a good system will need to be able to accept data from many different standards. This is precisely what we designed our API to do. Each data set will be tied to a service type (called a protocol type).

Here’s an example API call to a URL: https://api.taarifa.org/v2/[api key]/services.json?protocol_type=Open311

The system will host multiple service types. It should even be able to maintain custom services: if a user likes Open 311 but needs more form fields, they can “fork” the Open 311 service and create their own, which others can then use.
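Here is a minimal sketch of what forking a service could look like in Python. The `fork_service` helper, the dictionary layout, and the field names are all made up for illustration; they are not taken from the Taarifa code or from the Open 311 spec.

```python
import copy

# A hypothetical, heavily simplified service definition. The real Open 311
# standard defines its own fields; these names are placeholders.
OPEN311_LIKE = {
    "protocol_type": "Open311",
    "fields": {
        "description": "text",
        "address": "text",
        "lat": "number",
        "long": "number",
    },
}


def fork_service(base, new_protocol_type, extra_fields):
    """Return a copy of `base` with extra form fields, leaving `base` untouched."""
    forked = copy.deepcopy(base)
    forked["protocol_type"] = new_protocol_type
    forked["forked_from"] = base["protocol_type"]
    forked["fields"].update(extra_fields)
    return forked


# Fork the base service and add disaster-specific questions.
disaster_service = fork_service(
    OPEN311_LIKE,
    "Open311-disaster",
    {"house-has-mold": "boolean", "have-heat": "boolean"},
)
```

The deep copy matters here: the fork gets its own field list, so adding questions to it never changes the service other people are already using.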

I get the feeling that my non-nerd readers might be beginning to glaze over about now. The point is that the system can take in multiple kinds of structured data. The structured part of the data is very important because it allows us, as a world wide community, to build tools that can work with sets of data we have never seen. If I create a mapping application that reads Open 311 data then it doesn’t matter whose data gets passed to me: as long as it matches the Open 311 standard, my app can display the data in a meaningful way.

More information about the API design can be found on this HackPad.

The Basic Workflow

Let’s say you want to collect data on the rent prices in your neighborhood. You would set up an instance of Taarifa on your server, then select from a library of pre-built services (or protocols). If you can’t find one that exactly fits your need, you could choose one that’s close and “fork” it, that is, make a copy of an existing service and modify it. Your new service is then added to the community library of services so that the next person who wants to collect rent prices doesn’t have to start from scratch.

Now you go and collect your data. This can be done with paper forms which are later input into the system, or you can direct people to a web form (or a phone app, or anything). You choose which information is private (names, phone number, etc.) and what is public (rent price, street, etc.).

At this point your data is available for consumption via a RESTful API from your server. We want to create a method for servers to network so that your data can be shared with other Taarifa nodes. We can encrypt the private data so that it can be stored on other nodes but not read by anyone other than the people with the encryption key.
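The public/private choice can be sketched in a few lines of Python. The field names and the `split_record` helper below are assumptions for illustration, and the encryption step is left out entirely: this only shows how one record could be divided before the private half is encrypted.

```python
# Hypothetical field names; which fields count as private is up to
# whoever runs the server.
PRIVATE_FIELDS = {"name", "phone"}


def split_record(record, private_fields=PRIVATE_FIELDS):
    """Split one submitted record into a public part and a private part."""
    public = {k: v for k, v in record.items() if k not in private_fields}
    private = {k: v for k, v in record.items() if k in private_fields}
    return public, private


record = {
    "name": "Example Person",
    "phone": "555-0100",
    "street": "Example St",
    "rent-price": 1200,
}
public, private = split_record(record)
# `public` holds street and rent-price; `private` holds name and phone.
```

Only the `private` half would need to be encrypted before it is copied to other nodes; the `public` half can be shared as is.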

Why?

The web is moving away from single use web “sites” and toward sets of structured data called the semantic web. The data you collect should be owned by you and not rely on someone else’s server and web application. This strategy makes it possible for any number of applications to read, display, and even write data to your data set.

Furthermore, because you’ve created a custom service type for your data set other people can independently set up their own server, use your service (protocol) and begin to collect their own data which can be read by applications already configured to read your data.

After Hurricane Sandy hit New York City there were many organizations running around canvassing survivors. Yet if you search for a standard disaster canvassing form you’ll have a hard time finding it. Because there are no standards, each organization collected radically different data sets, even though they were all asking relatively the same questions. For instance one form might ask “does your home have heat?” while another might ask “Is your home without heat”. Both questions are asking basically the same thing, but the data they produce is different. If a standard disaster canvass form had been created then each group could have used that, collected data, and shared the standardized data.
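To show why the two heat questions produce different data, and how a shared standard would reconcile them, here is a small Python sketch. The form dictionaries and the `normalize` function are hypothetical; `have-heat` is simply the standard field name I am assuming both forms would map to.

```python
# Each hypothetical form maps its own question to the shared field name and
# says whether a "yes" answer means the standard value is True or False.
FORM_A = {"Does your home have heat?": ("have-heat", True)}
FORM_B = {"Is your home without heat?": ("have-heat", False)}

YES = {"yes", "y", "true"}
NO = {"no", "n", "false"}


def normalize(form, answers):
    """Translate one form's raw answers into standard field names and booleans."""
    out = {}
    for question, answer in answers.items():
        field, yes_means = form[question]
        text = answer.strip().lower()
        if text in YES:
            out[field] = yes_means
        elif text in NO:
            out[field] = not yes_means
        else:
            out[field] = None  # unanswered or unreadable
    return out


a = normalize(FORM_A, {"Does your home have heat?": "No"})
b = normalize(FORM_B, {"Is your home without heat?": "Yes"})
# Both forms now produce the same record: {"have-heat": False}
```

Without the mapping, a raw “No” on one form and a raw “Yes” on the other would look like opposite answers, even though both residents are saying the same thing.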

Final thoughts

What I’ve outlined above is fairly pie-in-the-sky but I think that it is quite possible. I’m very happy to begin working with the Taarifa team to make this a reality. I’ll be sending this post to them for review to make sure I’m not totally off base with their objectives. If this sounds interesting to you feel free to shoot me an e-mail and I’ll get you involved.

hello@dhornbein.com

 

Working on the May First Control Panel

I’ve been slowly working on the May First/People Link server admin control panel for the past few months.

I made a big push today. I upgraded the CSS framework to Foundation 4, a remarkable mobile-first framework. I’m using SCSS to develop the style sheets, which is really slick. SCSS allows me to do some programming-like magic within CSS, like variables and functions. The whole thing is built on “red”, which is one of the best server control panels I’ve ever used. While it’s not terribly newbie friendly, it is great for a mid range server person like myself. It doesn’t treat me like an asshole, which is nice. More documentation about red can be found in the support documents at MayFirst.org.

Let’s just look at some screen shots:

Here’s what the panel looks like currently:

I think the main improvement is the

side navigation and…

huge add new item button. Now no one will wonder how to add a new item.

My next step will be to build out some documentation on the mayfirst Trac and get this puppy in a place where it can be collaboratively worked on.

Racism, brought to you by the Dr Pepper Snapple Group

I just watched some problematic shit on the internet. Amazing, I know. It was in a post titled Dirty Laundry, the greatest justice video i have seen, with 2126 points (upvotes minus downvotes) on reddit.com.

It’s odd how much my environment has changed. I watched this and felt really uncomfortable. I went into the reddit comments to find some sanity: people commenting in a sober fashion about the underlying racism, the use of trite stereotypes, and analysis of a corporate sponsored racist snuff film.

Here’s the top comment:
Best comment on the video: "Brittle bone ass niggas need to drink their milk"

Not what I was looking for. I don’t want to demonize the, I’m going to assume, white man who wrote this. I simply don’t think this video is okay (or the comments made about it). I have no idea what it means to be oppressed, but this film, and the reaction of the people on reddit, don’t give me high hopes that people are going to be the least bit sensitive about how we as a society treat the festering wound of the systematic oppression of black men, among many oppressed people.

Right now, not 20 minutes away from where I live, folks have been out on the streets protesting the NYPD’s shooting of Kimani Gray, a 16 year old, in Flatbush. Meanwhile slavery is still going on in the states.

This isn’t a rant about this movie or the mindless comments. If I were a 16 year old white kid, not afraid of being shot by the police, in a place where racism is tolerated I would think this video is pretty fucking sweet. Blind to the social poison just under the surface.

We have a long way to go. Compassion for the fools who haven’t gotten a clue…

Video after the jump:

Data Anywhere – distributed data storage and sharing solution

At this weekend’s #OccupyDataNYC hack I worked on the Data Anywhere project presented by Gloria W of NYC Python Meetup. This project seems to have a lot of promise for solving the data management issues plaguing #OccupySandy and other relief organizations.

The Current Problem

an image of two humanoid figures standing together. One speaks while the other takes notes on a canvassing form.

Much of the data that is being collected is from canvassing.

Relief organizations canvass residents to better understand what a community needs. This data, normally collected on paper, then needs to be digitized. Once digitized, it then needs to be stored in a secure way. It also must remain available for review by the organization that “owns” the data, as well as any organizations working in coordination.

  • Organizations collecting data need a secure place to keep it.
  • Non-private data needs to be available for research and advocacy.
  • Private data needs to be available–in a secure way–to people who can act on it, even if they are not part of the organization that originally collected the data.

The existing solutions tend to be based on software that holds data along with doing a whole host of other great things like case management, mass emailing, etc. These solutions, however, are normally limited to members of the organization; advocacy groups will have a hard time gaining access to the system to run reports to further common causes. If two groups want to share data they have to agree to use one piece of software which can be very difficult if each group has invested time and money in their particular solution.

Once data is put into a locked system, it is safe–but, it can be almost impossible for even the most effective spontaneous grassroots organizations (like Occupy Sandy) to gain access to it to make it actionable.

Real-Life Use Case

The Staten Island Community and Interfaith Long Term Recovery Group (LTRG), which came together in Staten Island after Sandy, turned out volunteers to canvass over 1000 homes this January. The paper forms that were used then had to be digitized. With the help of Occupy Sandy volunteers, we were able to enter all this data a few weeks later.

A stack of canvas forms looms over our hero, the data entry volunteer. Beyond them are more volunteers in the background diligently turning paper into. The volunteer in the foreground discharges one comically large drop of sweat. Under such pressure who wouldn't

Taking the inconsistent form data, trying to decipher handwriting, and accurately entering that into a computer is no easy task.

This data was digitized via a Google Form that mirrored the questions on the canvass sheet. The data now lives on a Google Drive folder, where access is all or nothing.

In February, a heavy snowstorm loomed and an organizer with the LTRG asked me to create a report that could be shared with other relief organizations to check up on people who indicated that they were living without heat—a question on the canvass form. I hacked together a solution by creating a pivot chart in the Google Spreadsheet then exported a PDF with the names, addresses, and phone numbers of people who indicated they were living without heat. Volunteers were dispatched to deliver warming information and heaters to residents on the list.

It took weeks to even begin to make this data actionable; if new data had been added or entries had needed to be updated, there wouldn’t have been a clear way to do this. If you don’t happen to know the people who have access to the data you can’t even hope to use it for good. Non-personal data gathered from the canvass is equally inaccessible, hampering data nerds across the world from contributing to advocacy.

  • Data is hard, if not impossible, to access even for those who “own” the data.
  • Data is often stored on third party servers and access can be revoked without notice (do YOU know what Google’s Terms of Service are?).
  • Data is often not secure and access control is all or nothing.
  • Producing reports requires a good bit of work.

What Would a Good Solution Look Like?

I want to be able to ask a database “who in this region doesn’t have heat?” and be presented with a list of people who indicated to a canvasser that they didn’t have heat. Furthermore, I might want to cross-reference all homeowners who have flood damage and don’t have insurance. I want to share queries with collaborators as well as make non-private data available to the public. I want to pull data into my current workflow without having to learn new software!
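As a rough illustration of those two questions, here is a Python sketch that runs them against a handful of invented records. The flat record layout, the field names, and both helper functions are assumptions; a real system would ask the database rather than loop over a list.

```python
# Three made-up records; the field names are assumptions for illustration.
records = [
    {"gid": "A1", "zip": "10305", "have-heat": False,
     "house-has-damage": True, "have-insurance-flood": False},
    {"gid": "A2", "zip": "10305", "have-heat": True,
     "house-has-damage": True, "have-insurance-flood": True},
    {"gid": "A3", "zip": "10306", "have-heat": False,
     "house-has-damage": False, "have-insurance-flood": False},
]


def without_heat(records, zip_code):
    """Who in this ZIP code told a canvasser they don't have heat?"""
    return [r["gid"] for r in records
            if r["zip"] == zip_code and r["have-heat"] is False]


def damaged_and_uninsured(records):
    """Cross-reference: damage reported, but no flood insurance."""
    return [r["gid"] for r in records
            if r["house-has-damage"] and not r["have-insurance-flood"]]


no_heat_ids = without_heat(records, "10305")
at_risk_ids = damaged_and_uninsured(records)
```

Because each question is just a function over structured records, it can be saved, shared with collaborators, and re-run whenever the data changes.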

Ideally:

  • Data would have a persistent home.
  • Data would be machine readable.
  • Data stewards could manage access to the data, making some public and keeping some private.
  • Reports could be generated on the fly.
  • Third party applications could query the data.

Persistent Home

Currently data is on random computers, either in the cloud or in some person’s possession (if it’s even in a digital format at all). Having a persistent home where the data and subsequent updates can be housed makes finding the most up-to-date data easier.

03_sharing_v2

Behold the above diagram. At the top is freshly digitized data. It now has to be shared, manually, with collaborators who can then further share the data. This isn’t ideal because either the data ends up in the wrong hands (the magenta jerk) or the data becomes unreliable (bottom right).

Think about working on a shared Excel spreadsheet that gets e-mailed around: it quickly becomes hard to know which version is the most current or accurate.

Machine Readable

Data that fits a spec and is standardized so that it can be read by a machine means that powerful scripts can be executed against the data to perform all kinds of magic, from plotting points on a map to cross-referencing common data elements across data sets.

Access Management

It’s important to keep people’s private information secure while also allowing non-personal data to be available to the community. The data stewards need granular access management: some people need to see everything while others need only see some things. A good solution would allow for multiple levels of access to private data while allowing most data to be open to the public, if the stewards so desire.
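One simple way to picture granular access is a table of levels, each listing the fields it may see. The level names, field names, and the `view_for` helper in this Python sketch are all hypothetical; it is meant to show the idea, not a finished permission system.

```python
# Hypothetical access levels, each listing the fields it may see.
ACCESS_LEVELS = {
    "public": {"zip", "have-heat", "need-food"},
    "partner": {"zip", "have-heat", "need-food", "street"},
    "steward": {"zip", "have-heat", "need-food", "street",
                "first-name", "phone"},
}


def view_for(record, level):
    """Return only the fields of `record` that `level` is allowed to see."""
    allowed = ACCESS_LEVELS[level]
    return {k: v for k, v in record.items() if k in allowed}


record = {
    "first-name": "Example",
    "phone": "555-0100",
    "street": "Example St",
    "zip": "10305",
    "have-heat": False,
    "need-food": True,
}
public_view = view_for(record, "public")
partner_view = view_for(record, "partner")
```

The same stored record yields a different view for each audience, so the stewards decide once what each level can see instead of preparing a separate export for every request.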

Reports

The data isn’t of much use if you can’t query it and produce reports. Furthermore, reports shouldn’t have to be re-created every time the data is updated or changed slightly.

Third Parties

Within the Sandy response, many web and mobile applications sprang up to manage information and data. Occupy Sandy set up an instance of the open-source disaster relief software Sahana while another team set up Disaster Dispatch to manage relief efforts in Staten Island. Both of these web applications should be able to access our canvassing data. While Sahana is an exceptional tool, we can’t force other groups to use our software by locking our data up in the system. So our solution must be agnostic to any software platform.

The bottom line is that people don’t want to learn another program or create another user account on some web app, especially in disaster response or recovery.

While thinking about this I figured that the answer was to build an Application Programming Interface (API) around the data so that authorized users could query the data as well as post new data or updates. An API is simply a way for one program to interface with another. It plays the same role for software that the screen you are staring at, the buttons on an ATM, and your computer mouse play for humans.

There is a particular kind of API that I was envisioning, called a RESTful API. I’ll save the technical details for another blog post, but the basic idea is this: the data steward keeps data on a web server and provides a way for authenticated people to query the raw data from the server.

It turns out that is exactly what Data Anywhere aims to do.

Data Anywhere

Data Anywhere is not an application, but a strategy for setting up an off-the-shelf VPS (Virtual Private Server, i.e. a cheap web server) to host data and a custom API for querying that data. Here’s how it works:

  1. Data is collected and digitized.
  2. The digitized data is then mapped to a data structure and uploaded to a database on a configured server.
  3. An API is developed to query the data.
  4. Users with proper authentication can then access raw data simply by visiting a website.
  5. Third party developers can build apps that run on this data.

I’ll get into the more technical stuff later in this post. For now let’s think of it this way:

Your organization uses a proprietary system for managing canvassing data. You really want to share this data, but can’t have CSV files floating around and don’t have the time to create custom reports every time someone needs one.

If you had a Data Anywhere server you could drop your data set in there, then allow people to access specific areas of your data. If you wanted to share data with someone you would simply have to grant them access to the Data Anywhere server and they could begin asking the server, not you, for reports on your data.

Here’s a rough flow chart of the process:

dataanywhere_workflow_draf1

 

  1. Data is collected with a form based on community needs and collective data standards.
  2. Data is entered into a computer, digitized. If information is collected digitally this step is simplified.
  3. Data is mapped to a standardized format for storage. Structuring input data to match standards will simplify this process.
  4. Data is stored in an off-the-shelf server.
  5. A data API is developed to manipulate stored data. Data stewards control how data is made available. By following collective standards uniform data can be distributed to a network of data servers.
  6. App authors can access shared data from the server network to run services.
  7. Apps can interface with other systems both open and closed.

Allow me to get a little more technical. My non-techie readers can skip this section.

Under the Hood

We currently have an example set up using my fork of the Data Anywhere code on GitHub. The server is an off-the-shelf VPS running Fedora Linux. The stack consists of:

  • Nginx
  • Python with Flask
  • MongoDB

You can see the bash history of the full server setup here.

We use Python to parse raw data from a .CSV file. The data is then stored in MongoDB. A RESTful API is written using Python and Flask to call data from the database. Data can now be queried from the server! Private methods can be created and put behind HTTP authentication.
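The parsing step might look roughly like the sketch below. This is not the project’s actual code: the column headings, `COLUMN_MAP`, and both helper functions are assumptions, it uses only the Python standard library, and a plain list stands in for the MongoDB collection. The Flask routes are left out.

```python
import csv
import io

# Hypothetical mapping from the question text in the CSV header to a
# standard field name.
COLUMN_MAP = {
    "Does your house have heat?": "have-heat",
    "Need medical attention?": "need-medical",
    "Zip": "zip",
}
BOOLEAN_FIELDS = {"have-heat", "need-medical"}


def to_bool(text):
    """Map YES/NO style answers to True/False; anything else becomes None."""
    answer = text.strip().lower()
    if answer in ("yes", "y"):
        return True
    if answer in ("no", "n"):
        return False
    return None


def parse_csv(handle):
    """Turn CSV rows into dictionaries keyed by standard field names."""
    documents = []
    for row in csv.DictReader(handle):
        doc = {}
        for column, field in COLUMN_MAP.items():
            value = row.get(column) or ""
            if field in BOOLEAN_FIELDS:
                doc[field] = to_bool(value)
            else:
                doc[field] = value.strip()
        documents.append(doc)
    return documents


# A tiny in-memory CSV stands in for the exported spreadsheet.
sample = io.StringIO(
    "Does your house have heat?,Need medical attention?,Zip\n"
    "NO,YES,10305\n"
    "yes,,10306\n"
)
# This list stands in for the database collection the documents would go into.
documents = parse_csv(sample)
```

Blank answers become `None` rather than `False`, which keeps “the resident said no” separate from “nobody wrote anything down”.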

Why is this a good solution?

Data Anywhere allows organizations to keep data in a locally-controlled database, which can then be shared without any work on the part of the organization (aside from the initial set up).

It also allows for very granular access. You can set up many levels of access and control just how much data any of those levels has access to.

The solution is agnostic to existing systems. Because the API produces machine-readable data, existing software can interface with the data directly rather than through yet another software-based solution. The relief community then doesn’t have to decide on a single piece of software. Groups that aren’t heavy users of software could very well access data in formats suited for download and printing.

Anyone in the world can develop applications that use the data. I’ve provided a link to one I made below.

What are the shortfalls?

There is a lot of work that has to be done on the front end. Data must be mapped and merged, and then APIs must be written from scratch. These shortcomings can be overcome as more groups use the system: shared libraries, naming standards, and unified APIs can then be developed.

As the system becomes more automated with more developers adding to the project, the challenge shifts to one of creating standards—which, in my humble opinion, is a much better challenge to have: it is non-technical and so more people can work on it.

Let’s see it in action!

Just to recap, data (in this case canvassing data of resident needs) is uploaded into a server. The server can then be queried. Here’s an example:

This is a URL that prints data

That URL is asking the server for the data of all residents who indicated that they do not have heat and need medical attention. The server responds with data in a format called JSON, which looks like this:

{
      "timestamp": "2013-02-16T14:49:07Z", 
      "date": null, 
      "home": {
        "occupant-count": null, 
        "have-disabled": null, 
        "residence-other": "Family/Friend", 
        "have-seniors": false, 
        "have-children": true, 
        "resident": false, 
        "rent-or-own-1": null, 
        "rent-or-own-2": "OWN", 
        "damage": {
          "need-help-repair": true, 
          "house-has-mold": true, 
          "house-has-damage": true
        }
      }, 
      "id": "51328f9b2ceea80e0930fd2d", 
      "sandy": {
        "resident-during-sandy": false
      }, 
      "notes": {
        "note-insurance": "got $11,000 from insurance. and was told by FEMA that it was adequate?? electricity repair cost was $5,500.  needs help with dealing with mortgage company.", 
        "note-contact": null, 
        "note-housing": null, 
        "note-other": "can't move back in without floors.  has plywood to replace subfloor.  send grants list.", 
        "note-info": "currently on medical leave from Board of Education, needs food deliveries. rapid repairs screwed up heating so it's getting fixed on 2/18/13.  rapid repairs flooded attic.  evacuated when water reached knees.  has spots of mold on subfloors.", 
        "note-fema-sba": "might be eligible for unemployment because of stroke.  denied FEMA because didn't have inspection.  denied by SBA."
      }, 
      "contact": {
        "contact-origin": null, 
        "contact-lang": null, 
        "phone": null, 
        "first-name": null, 
        "email": null, 
        "last-name": null
      }, 
      "other": {
        "need-medical": true, 
        "house-rental": false, 
        "need-fema-appeal-lawyer": true, 
        "note-need-greatest": "Repairs", 
        "need-food": false, 
        "need-lawyer-2": true, 
        "need-rental-assistance-fema": true, 
        "????": "Completed", 
        "register-fema-denied": false, 
        "need-loan-lawyer": true, 
        "have-insurance-flood": false, 
        "need-shelter": false, 
        "have-heat": false, 
        "have-seriors": false, 
        "have-mortage-problem": true, 
        "have-electricity": true, 
        "filed-insurance-claim": "Yes", 
        "have-payment-fema": true, 
        "have-food-stamps": false, 
        "have-water": true, 
        "have-water-potable": true, 
        "have-insurance-flood-payment-ok": false, 
        "need-help-unemployment": true, 
        "applied-sba": true, 
        "have-house-sticker": "None", 
        "have-stress": true, 
        "have-payment-fema-rental": "received original $2900 but put into repairs", 
        "note-need-medical": "both asthmatic, difficult breathing because of mold.  husband goes to house to feed animals.  she needs medical attention bus is taking care of health.", 
        "have-plumbing": true, 
        "have-insurance-flood-payment-appeal": false, 
        "contact-insurer": "State Farm", 
        "need-food-nonperishable": true, 
        "have-payment-fema-ok": false, 
        "registered-fema": true, 
        "need-insurance-flood-payment-lawyer": true, 
        "interested-attend-meeting": true
      }, 
      "location": {
        "city": "Staten Island", 
        "state": "NY", 
        "street": null, 
        "zip": "10305"
      }, 
      "project-zone": null, 
      "gid": "SI1_1108"
    }

Yes I know, that looks very very scary. I wanted to show the raw data first to highlight what that data can become. Last night over the course of about an hour I was able to produce this test page:

That web page is accessing publicly available data from the Data Anywhere server. Now imagine that there are 10 different data servers owned by 10 different groups doing canvassing in NYC. My app could very well query all of them!
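Here is a hedged sketch of what querying several servers could boil down to once the responses arrive. No network calls are made: the JSON strings below are invented stand-ins for what each server might return, and `count_needs` is a hypothetical helper. Only the `other` block and the `need-*` key style follow the sample record above.

```python
import json
from collections import Counter

# Stand-ins for the JSON bodies that several servers might return.
# The records are invented for illustration.
responses = [
    json.dumps([
        {"gid": "SI1_0001", "other": {"need-medical": True, "need-food": False}},
        {"gid": "SI1_0002", "other": {"need-medical": False, "need-food": True}},
    ]),
    json.dumps([
        {"gid": "RK1_0001", "other": {"need-medical": True, "need-food": True}},
    ]),
]


def count_needs(bodies):
    """Count how many residents reported each need across all responses."""
    counts = Counter()
    for body in bodies:
        for record in json.loads(body):
            for key, value in record.get("other", {}).items():
                if key.startswith("need-") and value is True:
                    counts[key] += 1
    return counts


needs = count_needs(responses)
```

This only works because every server uses the same field names. The moment one group calls the field something else, the counts quietly go wrong, which is the whole argument for a shared standard.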

I want to drive this point home. Let’s say Restore the Rock collects a thousand canvass forms from the Rockaways. They used a standardized form (with some customization for the unique qualities of the neighborhood) and have digitized the responses. We now set up a Data Anywhere instance on a server which belongs to Restore the Rock. The API strategy is developed and now they are able to serve the data however they want. To ease sharing needs, the non-private portions of the data are made public. Anyone in the world can now access this hyper local data. In the above example I, a developer, took that data and made it into a web app. My web app can now be used by volunteers back in the Rockaways to see what supplies are most needed in the neighborhood.

Co-create this solution

I would like to preface this ask for help with the simple fact that I am no expert in any of this. I have never designed an API, deployed servers, or written scripts to parse data. So any advice you can give me is very much appreciated. I’ve identified three main areas that need work.

  1. Structuring raw data. We need to take raw data in almost any configuration and map it. Turn “Does your house have heat? NO” into have-heat = False. What’s the best way to do that?
  2. Writing clear, extendable, and flexible APIs. Each set of data is going to be different, but there must be existing standards for most of this stuff. How do we not already have a standard canvassing form?
  3. Automation for easy replication. One click install, though for Phase 1 we can settle for less.

If you are interested in this project, can lend me any advice, or know of events where I could pitch this to interested talented people, please e-mail me at hello@dhornbein.com

Check out the code on GitHub: github.com/dhornbein/DataAnywhere

Or the website here.

Leave questions or comments below!