Smart computer vision management system
The logistics sector is already equipping itself with artificial intelligence systems. Its key elements (personnel, goods, and equipment) all require accurate computer vision recognition to keep ports, warehouses, and every other logistics site secure. More than ever, however, the industry faces rapidly growing volumes of data across all scenarios, including behavior recognition, key security protection, and aerial monitoring. Pioneers at the forefront of the industry are now using Graviti's data platform as part of their AI infrastructure to supply their algorithm teams with the most valuable data and make their AI innovation more efficient.
One of Graviti's clients in logistics:
Our goal is to help our algorithm team access and operate massive datasets in the simplest and fastest way by introducing a standardized data management tool, which is why we are very happy to see that Graviti has entered the market. Unlike traditional companies that fixate on local file systems, Graviti provides us with a brand new cloud-based solution, and we are really excited about working with Graviti and releasing the full potential of our unstructured data.
Progress in AI is now driven by data
In 2021, developers saw the latest AI trend sweep the globe: MLOps (machine learning operations), which has fueled an ongoing debate about whether AI should be "model-centric" or "data-centric". Just a few years ago, the ML community was still confident in the former approach, which focuses on building the best model and tuning hyperparameters. Earlier this year, however, one of the most prominent ML scholars, Andrew Ng, gave his topical talk "MLOps: From Model-centric to Data-centric AI", in which he proclaimed that a data-centric era has dawned and that ML development should be conducted with this new zeitgeist in mind.
For smaller algorithm teams, a data-centric strategy is in fact more feasible and actionable than a model-centric one. A machine learning project goes through four phases: project definition, data collection, model training, and model deployment. The last three phases are iterated constantly throughout the project.
Data is like food for AI, and the industry needs customized datasets to be fed continuously into its models. Two factors determine model performance: data and algorithms. Real-world datasets are often noisy, and there are two ways to deal with that noise. One is to tackle the algorithm and design models that tolerate noise and still generalize well, which is obviously the more difficult route. The other is to tackle the dataset and improve the nutrition of what we feed our models. In practice, model improvements only scale when a tool chain and a systematic approach are used to raise data quality and to keep feeding the improved data back into training, so that the model gets a little better with every iteration.
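One common way to "tackle the dataset" is to look for samples whose labels are inconsistent before they reach training. The sketch below is only an illustration in plain Python, not Graviti's method or API: it assumes each sample was labeled by several annotators, and the function name, file names, and categories are all hypothetical.

```python
from collections import Counter


def flag_inconsistent_labels(annotations, min_agreement=1.0):
    """Return ids of samples whose annotators agree less than min_agreement.

    annotations maps a sample id to the list of labels that different
    annotators assigned to it (a hypothetical layout for this sketch).
    """
    flagged = []
    for sample_id, labels in annotations.items():
        if not labels:
            # A sample without any label also needs review.
            flagged.append(sample_id)
            continue
        # Share of annotators that chose the most frequent label.
        top_count = Counter(labels).most_common(1)[0][1]
        if top_count / len(labels) < min_agreement:
            flagged.append(sample_id)
    return flagged


# Hypothetical annotations for three warehouse camera frames.
annotations = {
    "frame_001.jpg": ["forklift", "forklift", "forklift"],
    "frame_002.jpg": ["pallet", "box", "pallet"],
    "frame_003.jpg": ["person", "person"],
}
flagged = flag_inconsistent_labels(annotations)
```

Flagged samples can then be sent back for relabeling instead of being fed to the model as they are.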
For leaders in the logistics industry, it is imperative that their AI teams focus on improving their datasets to achieve agile iterations of their models and bring true digital innovations to their supply chains.
Logistics, now powered by AI
In 2020, the logistics sector alone contributed 10% of global GDP, but the digitalization of the industry still leaves much room for improvement. A mammoth amount of data is hidden in every corner of the industry's daily operations, waiting to be discovered. Used well, this data will make the next logistics revolution a reality.
Although it has become the norm to use cutting-edge technologies such as big data and artificial intelligence to transform traditional industries, a phenomenon that cannot be overlooked is the exponential growth of unstructured data accumulated by monitors and sensors. In the meantime, the lack of matching tool chains and workflows for these unstructured data brings challenges to data-driven innovations. Working with industry-leading partners in logistics, Graviti has done thorough research into the pain points and challenges in their AI development, and knows what logistics companies actually need for data management:
- Consistent data and label formats: easy and flexible extraction and merging of data based on any given parameters;
- Unified storage path: easy to search and access the full amount of data collected from sensors and cover as many scenarios and scarce data samples as possible;
- Version traceability: record and compare the version changes of the dataset during training for error analysis and iterative improvement.
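The first two requirements can be pictured with a small sketch: if every sample is described by one consistent record format with a single storage path, extracting a subset by any parameter becomes a one-line query. This is plain Python written for illustration only; the `Record` fields, the paths, and the `select` helper are assumptions, not Graviti's data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One sample in a consistent, hypothetical format."""
    path: str        # single storage path for the raw file
    sensor: str      # which device captured the sample
    scenario: str    # where the sample was captured
    labels: tuple = ()


def select(records, **criteria):
    """Return the records whose fields match every given criterion."""
    return [
        record for record in records
        if all(getattr(record, key) == value for key, value in criteria.items())
    ]


# Hypothetical records; the paths are made up for this example.
records = [
    Record("datasets/port/cam01/0001.jpg", "camera", "port", ("crane",)),
    Record("datasets/port/drone02/0001.jpg", "drone", "port", ("container",)),
    Record("datasets/warehouse/cam07/0001.jpg", "camera", "warehouse", ("forklift",)),
]
port_drone = select(records, scenario="port", sensor="drone")
```

Because every record shares the same fields, subsets selected with different criteria can be merged again without any format conversion.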
With these key challenges in mind, Graviti has helped its partners develop cutting-edge deep learning engines that integrate data collection, labeling, model iteration, data structuralization, and deployment into a single AI data workflow.
Graviti's next-generation unstructured data platform is in the driver's seat of new AI innovations. Graviti provides powerful cloud-based features including data hosting, query, collaboration, visualization, and version control, all focused on the pain points of AI development. For its clients in logistics, Graviti provides the following solutions:
1. Easy development starts on the cloud
Once cloud storage access has been authorized, Graviti's data platform can be safely entrusted with raw data, labels, and metadata. Engineers can easily switch between individual and team workspaces, with a secure permission management system that allows cross-team collaboration and democratizes data access.
Algorithm teams in logistics often need to use the same original dataset for different training attempts in the R&D process. With Graviti, team members can ditch the old copy-and-paste workflow and fork datasets directly on the platform. In these forked branches, members can freely adjust the data to suit their specific development goals without affecting the original dataset.
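The idea behind forking can be shown with a minimal in-memory sketch. It is not Graviti's SDK: the `Dataset` class, its `fork` method, and the sample names are hypothetical, and a real platform would avoid physically copying the data.

```python
import copy


class Dataset:
    """A toy dataset: a name plus a mapping from file name to label."""

    def __init__(self, name, samples=None, parent=None):
        self.name = name
        self.samples = dict(samples or {})
        self.parent = parent  # name of the dataset this one was forked from

    def fork(self, new_name):
        """Create an independent branch that starts from the current samples."""
        return Dataset(new_name, copy.deepcopy(self.samples), parent=self.name)


# Hypothetical original dataset shared by the whole team.
original = Dataset("port-monitoring", {"0001.jpg": "crane", "0002.jpg": "truck"})

# One engineer forks it and adapts the data for a specific experiment.
experiment = original.fork("port-monitoring-night")
experiment.samples["0003.jpg"] = "container"
experiment.samples["0002.jpg"] = "lorry"
```

Changes made in `experiment` stay in the branch, so other team members keep training on an untouched `original`.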
2. WYSIWYG: Real-time data insight and version traceability
In the R&D process, algorithm teams need to prepare customized datasets for model training in different logistics monitoring scenarios. With Graviti's version control function, engineers can quickly iterate on new versions while keeping a clear record of the versioning process. This allows them to compare differences in data and annotations between versions in fine detail and significantly improves their productivity.
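Comparing two versions of a dataset boils down to finding which samples were added, removed, or relabeled. The following sketch shows that comparison in plain Python under the assumption that each version is a mapping from file name to annotation; it does not reflect how Graviti stores or compares versions, and all names are hypothetical.

```python
def diff_versions(old, new):
    """Compare two dataset versions given as {file name: annotation} mappings."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    # Files present in both versions whose annotation changed.
    modified = sorted(
        name for name in old.keys() & new.keys() if old[name] != new[name]
    )
    return {"added": added, "removed": removed, "modified": modified}


# Two hypothetical versions of the same dataset.
version_1 = {"0001.jpg": "forklift", "0002.jpg": "pallet", "0003.jpg": "person"}
version_2 = {"0001.jpg": "forklift", "0002.jpg": "box", "0004.jpg": "truck"}

changes = diff_versions(version_1, version_2)
```

A report like `changes` makes it easier to connect a shift in model performance to the exact data that changed between two training runs.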
Usually, annotated data is fed directly into the model for training after quality control, but the caveat is that even the slightest annotation errors can compromise model performance. With the visualization component of Graviti's platform, algorithm engineers can not only gain insight into the feature distribution of a dataset at the macro level, but also check individual files and labels at the micro level. This feature allows engineers to review annotation results before using the data for training, greatly reducing the time wasted on model debugging and re-annotation caused by substandard data quality.
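The same two levels of inspection can be approximated in a few lines of code. The sketch below is a plain Python illustration and not the platform's visualization component: it counts labels per category (macro level) and flags bounding boxes that fall outside their image (micro level). The sample layout and field names are assumptions.

```python
from collections import Counter


def category_distribution(samples):
    """Macro level: count how many boxes each category has."""
    return Counter(box["category"] for s in samples for box in s["boxes"])


def find_invalid_boxes(samples):
    """Micro level: list (file, category) pairs whose box is empty or out of bounds."""
    invalid = []
    for s in samples:
        for box in s["boxes"]:
            inside = (
                0 <= box["xmin"] < box["xmax"] <= s["width"]
                and 0 <= box["ymin"] < box["ymax"] <= s["height"]
            )
            if not inside:
                invalid.append((s["file"], box["category"]))
    return invalid


# Hypothetical annotated samples.
samples = [
    {"file": "0001.jpg", "width": 640, "height": 480, "boxes": [
        {"category": "forklift", "xmin": 10, "ymin": 20, "xmax": 200, "ymax": 300},
        {"category": "person", "xmin": 300, "ymin": 50, "xmax": 700, "ymax": 400},
    ]},
    {"file": "0002.jpg", "width": 640, "height": 480, "boxes": [
        {"category": "person", "xmin": 100, "ymin": 100, "xmax": 150, "ymax": 250},
    ]},
]

distribution = category_distribution(samples)
invalid = find_invalid_boxes(samples)
```

Here the second box of `0001.jpg` extends past the right edge of the image, so it is reported for review before training.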
3. Models are temporary, but pipelines are forever
Compared to honing that one ML model, optimizing and automating the entire machine learning workflow will bring a more profound and lasting advantage to the development process. Graviti is committed to helping its industry partners to achieve deeper integration with their existing AI workflows.
- Automate data workflows with Action: Data processes including collection, filtering, triggering tasks, and uploading can all be automated with Graviti. Algorithm teams no longer have to manually upload data to a cloud drive and then share with their team members.
- Manage the entire AI data lifecycle: Graviti's assistance can be extended to the data annotation process. Annotation results can be directly imported into its cloud platform for further processing, and engineers can achieve real-time modification of the results. Graviti thus makes it possible to have real-time feedback, traceability and optimization directly from the production environment.
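An automated data workflow of this kind can be pictured as a chain of small steps that run without manual intervention. The sketch below uses plain Python functions to show the shape of such a pipeline; it is not Graviti's Action feature, and the step names, the `sharpness` field, and the in-memory `uploaded` list standing in for cloud storage are all hypothetical.

```python
def run_pipeline(frames, steps):
    """Pass the frames through each step in order and return the result."""
    for step in steps:
        frames = step(frames)
    return frames


def drop_blurry(frames, min_sharpness=0.5):
    """Filtering step: keep only frames that are sharp enough."""
    return [f for f in frames if f["sharpness"] >= min_sharpness]


def drop_duplicates(frames):
    """Filtering step: keep the first frame for each file name."""
    seen, unique = set(), []
    for f in frames:
        if f["file"] not in seen:
            seen.add(f["file"])
            unique.append(f)
    return unique


uploaded = []  # stands in for a cloud dataset in this sketch


def upload(frames):
    """Upload step: record the file names that would be sent to the cloud."""
    uploaded.extend(f["file"] for f in frames)
    return frames


# Hypothetical frames collected from a warehouse camera.
collected = [
    {"file": "0001.jpg", "sharpness": 0.9},
    {"file": "0002.jpg", "sharpness": 0.2},
    {"file": "0001.jpg", "sharpness": 0.9},
    {"file": "0003.jpg", "sharpness": 0.7},
]
result = run_pipeline(collected, [drop_blurry, drop_duplicates, upload])
```

Because each step takes and returns a list of frames, new steps such as triggering an annotation task can be added to the chain without changing the existing ones.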
Better data means better AI. Graviti believes that the entire AI industry is leaving the old model-centric mindset behind and embracing the new data-centric approach. More and more smart logistics companies will see long-term benefits from working with Graviti and making data quality their top priority.
Leader of a logistics algorithm team:
Graviti's machine learning data platform has become a key part of our AI development process. Its powerful data management works seamlessly with our own workflows, providing us with very convenient features and developer tools for our data preparation. We are glad to work closely with Graviti and achieve our development goals more smoothly than ever.