Google touts open data cloud to unify information from all sources


Google Cloud aims to create what it says is the most open, scalable and powerful data cloud of all, as part of its mission to ensure that customers can use all their data, from any source, no matter where it resides or what format it is in.

Google announced this “data cloud” vision today at Google Cloud Next 2022, where it introduced a flurry of updates to its existing data services, as well as new ones. The new updates are all designed to deliver on this vision of an open and scalable data cloud.

“Every business is now a big data business,” Gerrit Kazmaier, vice president and general manager of Data Analytics at Google Cloud, told SiliconANGLE in an interview. “It’s a call for a data ecosystem. It will be a key part of modern business.”

One of the first steps in realizing this vision is to ensure that customers can effectively use all of their data. To this end, Google’s data warehouse service BigQuery has gained the ability to analyze unstructured streaming data for the first time.

BigQuery can now ingest all types of data, regardless of storage format or environment. Google said this is vital because most teams today can only work with structured data from operational databases and applications such as ServiceNow, Salesforce, Workday, etc.

But unstructured data, such as video from television archives, audio from call centers and radio, paper documents, and more, makes up more than 90% of all information available to organizations today. This data, which was previously left untouched, can now be analyzed in BigQuery and used to power services such as machine learning, speech recognition, translation, text processing and data analysis through a familiar Structured Query Language interface.

It’s a big step, but far from the only one. To further its goals, Google says, it is adding support for major data formats such as Apache Iceberg, Delta Lake, and Apache Hudi to its BigLake storage engine. “By supporting these widely adopted data formats, we can help break down barriers that prevent organizations from getting the most out of their data,” Kazmaier said. “With BigLake, you have the ability to manage data across multiple clouds. We’ll meet you where you are.”

Meanwhile, BigQuery is getting a new integration with Apache Spark that will allow data scientists to dramatically improve data processing times. Datastream is also being integrated with BigQuery, which will allow customers to more efficiently replicate data from sources such as AlloyDB, PostgreSQL and MySQL, as well as third-party databases such as Oracle.

To ensure users have greater confidence in their data, Google said it is expanding the capabilities of its Dataplex service, giving it the ability to automate processes associated with improving data quality and lineage. “For example, users will now be able to more easily understand data lineage – where the data came from and how it has been transformed and moved over time – reducing the need for manual, time-consuming processes,” Kazmaier said.

Unified business intelligence

Making data more accessible is one thing, but customers also need to be able to work with that data. To that end, Google announced that it would unify its portfolio of business intelligence tools under the Looker umbrella. Looker will be integrated with Data Studio and other core BI tools to simplify how users can get insights from their data.

As part of the integration, Data Studio is being rebranded as Looker Studio, helping customers go beyond examining dashboards by infusing their workflows and applications with out-of-the-box intelligence to make data-driven decision-making easier, Google said. Looker will, for example, be integrated with Google Workspace, providing easier access to information from productivity tools such as Sheets.

In addition, according to Google, it will be easier for customers to work with the BI tools of their choice. Looker already integrates with Tableau Software, for example, and soon it will do the same with Microsoft Power BI.

Powering artificial intelligence

One of the most common use cases for data today is powering AI services – an area where Google is a clear leader. It also doesn’t plan on giving up that lead anytime soon. In an effort to make computer vision and AI-based image recognition more accessible, Google is launching a new service called Vertex AI Vision.

The service extends the capabilities of Vertex AI, providing an end-to-end application development environment for visual data ingestion, analysis, and storage. Users will be able to stream video from manufacturing plants to create AI models that can improve safety, or take video footage of store shelves to better manage product inventory, Google said.

“Vertex AI Vision can reduce computer vision application build time from weeks to hours for one-tenth the cost of current offerings,” Kazmaier explained. “To achieve these efficiencies, Vertex AI Vision provides an easy-to-use drag-and-drop interface and a library of pre-trained ML models for common tasks such as occupancy counting, product recognition, and object detection.”

For less technical users, Google is introducing more “AI Agents,” tools that make it easy for almost anyone to apply AI models to common business tasks.

New AI agents include Translation Hub, which enables self-service document translation with support for an impressive 135 languages at launch. Translation Hub integrates technologies such as Google Neural Machine Translation and AutoML and works by ingesting and translating content from multiple document types, including Google Docs, Word, Slides and PDFs. Not only does it preserve exact layout and formatting, but it also comes with granular management controls, including support for human-in-the-loop post-editing feedback and document review.

Through Translation Hub, researchers could share important documents with their colleagues around the world, while suppliers of goods and services could reach underserved markets. Additionally, Google said, public sector administrators can reach more community members in their native language.

A second new AI agent is Document AI Workbench, which makes it easy to create custom document parsers that can be trained to extract and summarize key information from large documents. “Document AI Workbench can remove barriers to building custom document analyzers, helping organizations extract areas of interest specific to their business needs,” said June Yang, vice president of cloud AI and industry solutions.

Google also introduced Document AI Warehouse, which is designed to remove the challenge of tagging and extracting data from documents.

Extensive integrations

Finally, Google said it is expanding its integrations with some of the most popular enterprise data platforms to ensure the information stored there is also accessible to its customers.

Kazmaier explained that providing customers with the flexibility to work on any data platform is key to ensuring choice and avoiding data lock-in. With that in mind, he said, Google is committed to working with all major enterprise data platform providers, including Collibra NV, Databricks Inc., Elastic NV, Fivetran Inc., MongoDB Inc., Reltio Inc. and Striim Inc., to make sure their tools work with its products.

David Meyer, senior vice president of product management at Databricks, told SiliconANGLE in an interview that the company has been working with Google for about two years on BigQuery support for Databricks’ Delta Lake, following similar work with Amazon Web Services Inc. and Microsoft Corp.’s Azure.

“Making sure you don’t have to move data out of your data lake reduces cost and complexity,” Meyer said. “We see this as an inflection point.” Even so, he added, this is just the beginning of work with Google Cloud, and the two companies will work to resolve other challenges, such as joint governance efforts.

Kazmaier said the company is also working with the 17-member Data Cloud Alliance to promote open standards and interoperability in the data industry. It also continues to support open source database engines such as MongoDB, MySQL, PostgreSQL, and Redis, as well as Google Cloud databases such as AlloyDB for PostgreSQL, Cloud Bigtable, Firestore, and Cloud Spanner.

With reporting by Robert Hof

Image: Google
