Data Chat

From Random Hacks of Kindness


DataChat

Owner

Proposed by: google.org

Contact (name, email, phone, skype): Pablo Mayrgundter <pmy@google.com>, Steve Hakusa <shakusa@google.com>

Best way and times to contact during RHoK 2.0 Dec 4/5 2010: email

Summary

Tweak the Tweet (TtT) has shown some of the potential for exchanging structured information over a public network like Twitter. It uses the existing practice of #hashtag markup to annotate crisis-related messages, and recommends new conventions for encoding some data types next to certain tags.

However, significant problems remain with this approach:

  • requires an existing support community and manual processing
  • minimal/ad-hoc built-in datatype support, e.g. #loc
  • no built-in or well-specified mechanism for storing the encoded/structured information
  • lack of composition: complex record types are difficult or impossible to specify, populate or query
  • high recall, low precision: e.g. #need is intended to specify needs in a crisis situation but in practice also picks up "#need a haircut"
  • lack of built-in authenticity or integrity reporting
  • no use of existing schemas, e.g. could Person Finder's PFIF be used to specify the kinds of fields that are useful for tracking missing people?

An extended language should be researched and developed, building on the experience of TtT, to address these outstanding issues.


Example

The examples are mostly extensions of TtT use cases. They demonstrate a strawman language called Slang:

Slang specification

By example, the following Slang messages describe a Red Cross clinic in Haiti and its capacities:

 #ht.redcross // implicit create; should it report whether the name already exists?
 #ht.redcross.clinic42
 #ht.redcross.clinic42{loc:18.563436,-72.319565}
 #ht.redcross.clinic42{beds:10}

The messages above use two constructs:

  • Name: a reverse domain name that specifies a unique namespace, e.g. #ht.redcross.clinic42
  • Record: an attribute/value object notation inspired by JSON, e.g. {beds:10}
    • attribute and value strings need to be escaped only when the delimiters are also part of the string, e.g. {time:"10:00pm"}, where the value is quoted because it contains a colon
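
The name and record notation can be sketched as a small parser. This is only an illustration under assumptions: the grammar is inferred from the example messages rather than from any published Slang specification, and the names `SLANG_RE`, `strip_quotes` and `parse_slang` are hypothetical. It handles one attribute per record, as in the examples.

```python
import re

# Assumed grammar, inferred from the examples (not an official spec):
#   message := "#" name [ "{" attr ":" value "}" ]
#   name    := dot-separated segments of letters, digits or underscores
SLANG_RE = re.compile(
    r'^#(?P<name>[A-Za-z0-9_]+(?:\.[A-Za-z0-9_]+)*)(?:\{(?P<body>.*)\})?$'
)


def strip_quotes(text):
    """Remove one pair of surrounding double quotes, if present."""
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]
    return text


def parse_slang(message):
    """Split a Slang message into (name_parts, record).

    record is None when the message has no {attr:value} part.
    """
    # Drop a trailing " // comment", as used in the first example line.
    # Assumption: comments are always preceded by whitespace.
    message = re.split(r"\s+//", message, maxsplit=1)[0].strip()
    match = SLANG_RE.match(message)
    if match is None:
        raise ValueError("not a Slang message: %r" % message)
    name_parts = match.group("name").split(".")
    body = match.group("body")
    if body is None:
        return name_parts, None
    # Split on the first colon only, so values such as "10:00pm"
    # or 18.563436,-72.319565 stay intact.
    attr, _, value = body.partition(":")
    return name_parts, {strip_quotes(attr.strip()): strip_quotes(value.strip())}
```

For example, `parse_slang('#ht.redcross.clinic42{beds:10}')` yields the name parts `['ht', 'redcross', 'clinic42']` and the record `{'beds': '10'}`. Values are kept as strings here; typed values (numbers, coordinates) would need a further convention that the strawman does not yet define.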

Querying

Namespaces and objects are queried by replacing a specifier with a question mark "?", e.g.:

 #ht.redcross.?

should return a list of Red Cross clinics in Haiti.

 #ht.redcross.clinic42{beds:?}

should return 10.
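
The two query forms can be sketched against a simple in-memory store. Everything here is an assumption made for illustration: the `STORE` layout, the `query` function and the second clinic entry (`clinic43`) are hypothetical and do not describe how the prototype stores or resolves records.

```python
# Hypothetical in-memory store: full Slang name -> record of attributes.
# clinic43 is made-up sample data, added only to show a multi-item listing.
STORE = {
    "ht.redcross": {},
    "ht.redcross.clinic42": {"loc": "18.563436,-72.319565", "beds": "10"},
    "ht.redcross.clinic43": {"beds": "4"},
}


def query(store, message):
    """Answer a Slang query containing a "?" placeholder.

    Supports the two forms shown above:
      #a.b.?          -> sorted list of names directly below a.b
      #a.b.c{attr:?}  -> value of attr on a.b.c, or None if unknown
    """
    text = message.strip().lstrip("#")
    if text.endswith("}") and "{" in text:
        # Attribute query: split "name{attr:?" into its parts.
        name, _, body = text[:-1].partition("{")
        attr, _, value = body.partition(":")
        if value.strip() != "?":
            raise ValueError("not a query: %r" % message)
        return store.get(name, {}).get(attr.strip())
    if text.endswith(".?"):
        # Namespace query: keep the trailing dot, e.g. "ht.redcross."
        prefix = text[:-1]
        return sorted(
            name for name in store
            if name.startswith(prefix) and "." not in name[len(prefix):]
        )
    raise ValueError("not a query: %r" % message)
```

With this sample data, `query(STORE, "#ht.redcross.?")` lists the two clinic names and `query(STORE, "#ht.redcross.clinic42{beds:?}")` returns the string `"10"`. Listing only direct children, rather than every descendant, is a design choice of this sketch; the strawman leaves that open.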

Aliases

An entity may be referred to by multiple Slang names:

 #ht.redcross
 #org.redcross.haiti

Linking and reverse lookup should be supported to enable different Slang communities to discover each other:

 #ht.redcross{alias:org.redcross.haiti}
 #ht.redcross{alias:?}
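
Linking and reverse lookup could be modelled as a bidirectional index, sketched below. The `AliasIndex` class and its methods are hypothetical, and treating aliases as symmetric and transitive is an assumption of this sketch, not something the strawman specifies.

```python
from collections import defaultdict


class AliasIndex:
    """Bidirectional alias links between Slang names (hypothetical helper)."""

    def __init__(self):
        self._links = defaultdict(set)

    def link(self, name, alias):
        # Store the link in both directions so that either community
        # can discover the other starting from its own name.
        self._links[name].add(alias)
        self._links[alias].add(name)

    def aliases(self, name):
        """Return every other name reachable from `name` via alias links."""
        seen = {name}
        stack = [name]
        while stack:
            current = stack.pop()
            # .get avoids creating empty entries for unknown names.
            for other in self._links.get(current, ()):
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        seen.discard(name)
        return sorted(seen)


index = AliasIndex()
# Corresponds to: #ht.redcross{alias:org.redcross.haiti}
index.link("ht.redcross", "org.redcross.haiti")
```

Here `index.aliases("ht.redcross")` answers `#ht.redcross{alias:?}`, and `index.aliases("org.redcross.haiti")` gives the reverse lookup without a second link message.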


User story 1: I need to know bed availability at the clinic a mile away to plan patient referrals

How many beds are currently available at the MSF Delmas clinic?

User story 2: I need to verify information sources

Who reported the bed count first? Has it been updated? Are they associated with an organization I trust?


Description and Constraints

As in TtT, messages should be short and effective, carrying both data and narrative. Standards should emerge through communal iteration towards effective formats.


Extra Credit

Connect record creation, querying and updating to http://code.google.com/p/datawiki/


Similar projects and Resources

http://code.google.com/p/haggle/ - includes pubsub and "delay tolerant search"; shared attributes imply relationships between different objects or between types. Prophit delegate forwarding algorithm.


What next and Sustainability

How will this work be taken to real users, or further developed? Will this be an ongoing team? Is there an NGO or group sponsoring it as an ongoing project? Who, how and when?

Current State and Solutions

There is some prototype code available here:

http://code.google.com/p/datachatbot/

The code is running at:

http://datachatbot.appspot.com/
