Technology and Consulting Firm Releases Its First Open Source Project, Addressing Common Data Analytics Challenges for Machine Learning and More
Maven Wave, an Atos company, announces the launch of data-describe, an open source toolkit for inspecting, illuminating, and investigating enormous amounts of unknown data. Maven Wave’s first open source project, data-describe, solves common exploratory data analysis (EDA) challenges such as repetitive tasks, data size limitations, privacy issues with machine learning data, and more. With enterprise data sets rapidly growing in size and data analysis applications multiplying constantly, data-describe helps data scientists streamline EDA so they can focus on analysis over coding.
With unknown “dark” data, “unclean” data, structured and unstructured data, and data embedded in images and documents, data scientists often struggle to establish a clear understanding of their data environments. data-describe profiles data and reveals its true landscape, offering a rich set of tools chained together to automate common data analysis tasks. A team with more than 40 years of data science experience, including Maven Wave’s Managing Director and Data Science Lead, Brian Ray, and well-known open source contributor and Maven Wave consultant Yuan Tang, built this new toolset.
data-describe features include:
- Streamlined data summaries for important statistics
- Clustering functionality
- Out-of-the-box correlation matrices
- Heatmaps that quickly visualize data outliers and missing values
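For readers unfamiliar with these EDA tasks, the sketch below shows what a data summary, a correlation matrix, and a missing-value check look like when done by hand. It uses plain pandas, not data-describe's own API, which is not shown in this announcement. The table and its column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data set, invented for this example only.
df = pd.DataFrame({
    "age": [25, 32, 47, 51, np.nan, 38],
    "income": [40000, 52000, 88000, 91000, 61000, np.nan],
    "tenure": [1, 3, 10, 12, 5, 6],
})

# Data summary: count, mean, std, min, quartiles, and max per column.
summary = df.describe()

# Correlation matrix: pairwise Pearson correlation between columns.
correlations = df.corr()

# Missing values: a boolean mask (the raw input for a heatmap)
# and the number of missing entries per column.
missing_mask = df.isna()
missing_counts = missing_mask.sum()

print(summary)
print(correlations)
print(missing_counts)
```

Each of these steps is a short call on its own, but repeating them for every new data set is the kind of routine work that an EDA toolkit aims to bundle.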
“After using data-describe for the initial data exploration phase of several projects, we were amazed at the value it delivers,” says Brian Ray, Managing Director, Global Data Science Lead at Maven Wave. “Allowing us to tackle data obstacles quickly, data-describe cuts in half the time it takes to set up projects and complete common initial data tasks. The Maven Wave data science team solves complex analytical problems for enterprises through the power of data science combined with cloud enablement; data-describe allows us to accelerate that process so our clients can more effectively and economically adopt emerging technology.”