Web Blog_blogentry_290417_1


Ran Across Today#

Personal Metadata[1]#

Personal metadata (digital information about users' location, phone call logs, or web searches) is undoubtedly the oil of modern data-intensive science and of the online economy. This high-dimensional metadata is what allows apps to provide smart services and personalized experiences. From Google's search to Netflix's “movies you should really watch,” from Pandora to Amazon, metadata is used by commercial algorithms to help users become more connected, productive, and entertained. In science, this high-dimensional metadata is already used to quantify the impact of human mobility on malaria or to study the link between social isolation and economic development.

Metadata has, however, yet to realize its full potential. This data is currently collected and stored by hundreds of different services and companies. Such fragmentation makes the metadata inaccessible to innovative services, researchers, and often even to the individual who generated it in the first place. On the one hand, individuals' lack of access to and control over their metadata is fueling growing concerns, because it makes it very hard, if not impossible, for an individual to understand and manage the associated risks. On the other hand, privacy and legal concerns are preventing metadata from being reconciled and made broadly accessible, mainly because of the risk of re-identification.

Data ownership and privacy[2]#

Perhaps the greatest challenge posed by this new ability to sense the pulse of humanity is creating a "new deal" around questions of privacy and data ownership. Many of the network data that are available today are freely offered because the entities that control the data have difficulty extracting value from them.

As we develop new analytical methods, however, this will change. Moreover, not all people who want access to the data do so for altruistic motives, and it is important to consider how to keep the individuals who generate this information safe. Advances in analysis of network data must be approached in tandem with understanding how to create value for the producers and owners of the data while at the same time protecting the public good. Clearly, our notions of privacy and ownership of data need to evolve in order to adapt to these new challenges.

This raises another important question: how do we design institutions to manage the new types of privacy issues that will emerge with these new reality mining capabilities? Digital traces of people are ubiquitously preserved within our private and public organizations: location patterns, financial transactions, public transportation, phone and Internet communications, and so on. Certainly new types of regulatory institutions are required to deal with this information, but what form should they take?

Companies will have a key role in this new deal for privacy and ownership. One suggestion is an incentive system that gives added value to the users. Market mechanisms appear to be a particularly interesting avenue of exploration, since they may allow people to give up their data for monetary or service rewards. Ideally, such a system would be put into place in order to gain approval from the majority of the population to use data collected from their digital interactions.

Other important considerations revolve around data anonymity. The use of anonymous data should be enforced, and analysis at the group level should be preferred over that at the individual level. Robust models of collaboration and data sharing need to be developed; guarding both the privacy of consumers and corporations’ legitimate competitive interests is vital here.
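As a rough illustration of preferring group-level over individual-level analysis, the sketch below counts distinct users per area and withholds any area whose group is too small to report. The record layout `(user_id, area)`, the function name `group_level_counts`, and the `min_group_size` threshold are all assumptions made for this example; they do not come from the cited sources, and a size threshold on its own is not a guarantee of anonymity.

```python
from collections import defaultdict


def group_level_counts(records, min_group_size=5):
    """Return the number of distinct users per area.

    Areas with fewer than `min_group_size` distinct users are left out,
    so that no very small group is reported. (Hypothetical helper.)
    """
    users_by_area = defaultdict(set)
    for user_id, area in records:
        # A set keeps each user once per area, even with repeated records.
        users_by_area[area].add(user_id)

    return {
        area: len(users)
        for area, users in users_by_area.items()
        if len(users) >= min_group_size
    }


# Made-up example records: (user_id, area)
records = [
    ("u1", "north"), ("u2", "north"), ("u3", "north"),
    ("u1", "north"),   # repeated visit by the same user
    ("u4", "south"),   # a single user: too small a group to report
]

counts = group_level_counts(records, min_group_size=3)
print(counts)  # {'north': 3}
```

Only the per-area totals leave the function; the individual identifiers stay inside it, and the "south" group is suppressed because it contains a single user.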

What must be avoided is either the retreat into secrecy, so that these data become the exclusive domain of private companies and remain inaccessible to the Common Good, or the development of a “big brother” model, with government using the data but denying the public the ability to investigate or critique its conclusions.

Neither scenario will serve the long-term public interest in having a transparent and efficient government.

The new deal on data#

The first step toward open information markets is to give people ownership of their data. The simplest approach to defining what it means to "own your own data" is to go back to Old English Common Law for the three basic tenets of ownership, which are the rights of possession, use, and disposal:
  • possession: You have a right to possess your data. Companies should adopt the role of a Swiss bank account for your data. You open an account (anonymously, if possible), and you can remove your data whenever you’d like.
  • use: You, the data owner, must have full control over the use of your data. If you’re not happy with the way a company uses your data, you can remove it. All of it. Everything must be opt-in, and not only clearly explained in plain language, but with regular reminders that you have the option to opt out.
  • disposal: You have a right to dispose of or distribute your data. If you want to destroy it or remove it and redeploy it elsewhere, it is your call.
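
The three rights above could be modelled, very roughly, as operations on a personal data account. This is a minimal sketch under stated assumptions: the class `DataAccount`, its method names, and the purpose string `"traffic-research"` are hypothetical and chosen only for illustration; they do not describe any existing service or API.

```python
class DataAccount:
    """Hypothetical personal data account: possession, use, disposal."""

    def __init__(self):
        self._records = {}      # key -> value, held on the owner's behalf
        self._consents = set()  # purposes the owner has explicitly opted in to

    # possession: the owner can deposit data and take all of it back
    def deposit(self, key, value):
        self._records[key] = value

    def withdraw_all(self):
        """Hand every record back to the owner and keep nothing."""
        data, self._records = self._records, {}
        self._consents.clear()
        return data

    # use: nothing is readable unless the owner has opted in to that purpose
    def opt_in(self, purpose):
        self._consents.add(purpose)

    def opt_out(self, purpose):
        self._consents.discard(purpose)

    def read(self, purpose):
        if purpose not in self._consents:
            raise PermissionError(f"owner has not opted in to {purpose!r}")
        return dict(self._records)  # a copy, so callers cannot alter the account

    # disposal: the owner can destroy the data outright
    def destroy(self):
        self._records.clear()
        self._consents.clear()


account = DataAccount()
account.deposit("location", "home")
account.opt_in("traffic-research")
print(account.read("traffic-research"))  # {'location': 'home'}

account.opt_out("traffic-research")      # use is revoked from here on
exported = account.withdraw_all()        # data leaves with its owner
```

Access is denied by default in this sketch, which mirrors the opt-in requirement: a purpose the owner never approved raises `PermissionError` rather than silently returning data.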

Ownership seems to be the minimal guideline for the "new deal on data". There needs to be one more principle, however: adopting policies that encourage the combination of massive amounts of anonymous data to promote the Common Good. Aggregate and anonymous location data can dramatically improve society. Patterns of how people move around can be used for early identification of infectious disease outbreaks, protection of the environment, and public safety. Such data can also help us measure the effectiveness of various government programs, and improve the transparency and accountability of government and nonprofit organizations.

Web Blog_blogentry_290417_1 and IoT#

In this IoT scenario, billions of devices collect data out in the world and send it back to somebody's cloud for storage and/or processing. That data has value, not only to the company generating it, but to the technology companies that provide the data-crunching services. And as the whole notion of "big data" involves aggregating data from many sources, analyzing it, and slicing and dicing it, the issue of data provenance and data ownership becomes murkier.[3]
