IT Brief Asia - Technology news for CIOs & IT decision-makers
Snowflake announces Snowpark for Python and other updates
Thu, 16th Jun 2022

Snowflake has revealed new enhancements to improve programmability for data scientists, data engineers and application developers.

The latest updates bring Python to the forefront with the launch of Snowpark for Python, now in public preview, and a native integration with Streamlit for rapid application development and iteration, which is currently in development.

Additionally, Snowflake is streamlining access to more data with new updates for working with streaming data, and is making data stored in open formats and in on-premises systems available in the Data Cloud.

The introduction of Snowpark, Snowflake's developer framework, opens up a programming environment for data scientists, data engineers, and application developers to build scalable pipelines, applications, and machine learning (ML) workflows directly in Snowflake using their preferred languages and libraries.

Snowflake is further extending what users can build with Snowpark for Python, making Python's ecosystem of open-source packages and libraries accessible in the Data Cloud.

With a highly secure Python sandbox, Snowpark for Python runs on the same Snowflake compute infrastructure as Snowflake pipelines and applications written in other languages.

Developers now have the opportunity to streamline and modernise their data processing architecture by consolidating their Python-based data processing in Snowflake using Snowpark.

Additional updates complementing Snowpark for Python include the following:

  • Snowflake Worksheets for Python, now in private preview, enables users to develop pipelines, ML models, and applications directly in Snowsight, Snowflake's user interface, using Python and Snowpark's DataFrame APIs for Python. It streamlines development with code auto-complete and the ability to productise custom logic in seconds.
  • Snowflake's Streamlit Integration, currently in development, brings Python-based application development directly into Snowflake, enabling users to build interactive applications, and securely share, iterate, and collaborate with business teams to increase the impact of development.
  • Large Memory Warehouses, currently in development, empowers users to securely execute memory-intensive operations such as feature engineering and model training on large datasets using popular Python open-source libraries available through the Anaconda integration.
  • SQL Machine Learning, starting with time-series forecasting, now in private preview, empowers SQL users to embed ML-powered predictions into their everyday business intelligence and analytics to improve decision quality and speed.
  • Python's ecosystem of open-source packages is a top choice for developers, and Snowflake's continued partnership with Anaconda extends access to more Python packages in Snowflake, with all code running in a highly secure sandboxed environment. The Snowpark Accelerated program has also seen continued growth, in large part due to Snowflake's Python advancements, with more partners building with Python to extend the power of the Data Cloud in their language of choice, the company states.

Getting access to the right data quickly and efficiently is critical for improving developer productivity, building ML models with increased accuracy, and delivering more powerful applications, Snowflake states. According to the company, the latest enhancements enable teams to experiment faster, with more data at their fingertips, driving increased programming capabilities and deeper insights for users.

New innovations include:

  • Streaming Data Support to eliminate the boundaries between streaming and batch pipelines with Snowpipe Streaming, now in private preview, for serverless ingestion of streaming data, and Materialised Tables, currently in development, which make it simple to transform streaming data declaratively.
  • Iceberg Tables in Snowflake, currently in development, to enable users to work with Apache Iceberg, a popular open table format, in external storage while taking advantage of the Snowflake platform, simplifying overall data management and enabling architectural flexibility.
  • External Tables for On-Premises Storage, now in private preview, to allow users to access their data in on-premises storage systems from vendors such as Dell Technologies, Pure Storage, and others from within Snowflake, so they can benefit from the elasticity of the Data Cloud without moving this data.

Snowflake senior vice president Christian Kleinerman says, "We are heavily investing in Python to make it easier for data scientists, data engineers, and application developers to build even more in the Data Cloud, without governance trade-offs.

"Our latest innovations extend the value of our customers' data-driven ecosystems, enabling them with more access to data and new ways to develop with it directly in Snowflake.

"These capabilities, paired with Snowflake's best of class data security and privacy, are changing the way teams experiment, iterate, and collaborate with data to drive value."