June 29, 2016

Government Agencies Consider Data Lakes in the Cloud

Tim Bryant

Success in many industries now depends on the ability to turn disruptive ideas into value, and Big Data is enabling companies to innovate continuously and faster than ever before. With a host of current initiatives designed to help government agencies develop computing technologies for extracting knowledge and insight from large collections of digital data, the public sector is jumping on board.

Government entities collect and produce a great deal of information spanning many types and formats, including census data, tax information, video surveillance files, and geospatial intelligence. To use it for detecting patterns, analyzing trends, and producing valuable intelligence on the current and future state of the world, these data volumes must be ingested, analyzed, and stored securely and cost-effectively for the long term.

Recent projections from immixGroup suggest that total spending on Big Data resources among federal agencies could reach $3.9 billion by 2017 and $4.2 billion by 2018. A substantial percentage of this spending will go to basic data collection, cleansing, and aggregation activities. As a result, many agencies are choosing to implement data lakes to streamline the ingestion process, eliminate data silos, and seamlessly combine data from many sources in a single Hadoop-based storage repository.

Data lakes make it possible to store various types of structured and unstructured data in one place, allowing data to remain in its native format to increase its accessibility for analytical purposes. The ability to gain a complete view of all data in its original condition enables researchers to quickly pinpoint patterns and extract valuable insight that can be used to improve everything from government operations to public safety to intelligence initiatives.
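The "store in native format" idea above can be sketched in a few lines. The example below is a minimal, hypothetical illustration using an ordinary filesystem as a stand-in for a Hadoop-based repository: files from different sources land byte-for-byte unchanged in a per-source "raw" zone, so downstream tools can interpret them later (schema-on-read) rather than forcing a schema at ingestion time. The directory layout and function name are assumptions for illustration, not a specific product's API.

```python
import shutil
import tempfile
from pathlib import Path

def ingest_raw(source_file: Path, lake_root: Path, source_name: str) -> Path:
    """Land a file in the lake's raw zone unchanged, preserving its
    native format so analysis tools can apply schema-on-read later."""
    dest_dir = lake_root / "raw" / source_name
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / source_file.name
    shutil.copy2(source_file, dest)  # byte-for-byte copy; no transformation
    return dest

# Demo: mixed formats from different sources land side by side in one repository
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    lake = root / "lake"
    csv_file = root / "census.csv"
    csv_file.write_text("tract,population\n0101,4321\n")
    landed = ingest_raw(csv_file, lake, "census")
    print(landed.relative_to(lake))  # raw/census/census.csv
```

In a production Hadoop deployment the same pattern would target HDFS or cloud object storage paths instead of local directories, but the principle is identical: ingestion preserves the original bytes, and structure is imposed only when the data is read.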

The persistent growth of data is prompting agencies to consider hosted options for their data lake environments, integrating on-premises storage technologies with cloud services to realize new efficiencies and lower costs. A managed approach can offer a variety of benefits for data management in the public sector:

  • The massive scalability of cloud environments can simplify the data ingestion and integration process and enable real-time, data-driven decision-making.
  • The data security concerns of government agencies can be addressed through a comprehensive set of security controls meant to safeguard sensitive data and ensure compliance.
  • Virtually limitless volumes of data can be stored reliably and cost-effectively for as long as they are needed, and managed using a single unified platform.
  • Hosted offerings can streamline the process of collecting and processing raw, unstructured datasets flowing in from social, mobile, or external sources.

Enterprise-grade Hadoop software backed by a hardware infrastructure purpose-built for the demands of Big Data can transform today’s federal entities into data-driven epicenters of intelligence. As data lakes become an essential technique for managing massive amounts of data, more government agencies will consider moving their data lakes to hosted environments to gain massive scalability and reduce management complexity.

New technologies are extending the possibilities of government data while enabling agencies to store their data volumes in a more efficient and cost-effective way. The ability to harness data’s potential to deliver insightful information will drive greater value not only for agencies themselves, but for the constituents that they serve.
