Go library to read/write Parquet files. Developed at Twilio Segment
Updated May 31, 2024 - Go
MongoDB integrations for Apache Arrow. Export MongoDB documents to NumPy arrays, Parquet files, and pandas DataFrames in one line of code.
This project demonstrates efficient stream processing, data archiving, and real-time analysis of weather data using modern data-intensive application techniques
A lightweight Java library that facilitates reading and writing Apache Parquet files without Hadoop dependencies
Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.
Big data project: information storage in HDFS
Easily convert JSON data into Parquet format for efficient storage and analysis. Simplify data processing and analysis pipelines by converting JSON objects into optimized Parquet files.
Processing and exporting data from EPW files into other formats.
FegTec is a fictional company that wants to transfer Parquet files containing customer data from the AWS cloud to Google Cloud
NOAA data pipeline, queryable from the browser
A small, simple CLI for working with Parquet files
A summative coursework for CSC8101 Engineering for AI
Glue Data Quality example: deploy to your AWS account with Terraform to test it
Processes S3 Inventory manifests and generates a report on folder sizes and average object size
The Petastorm library enables single-machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as TensorFlow, PyTorch, and PySpark and can be used from pure Python code.
Threat Detection and Visualization
High-performance data loader for OSM planet dumps. Transforms an OpenStreetMap world or region PBF dump into a lossless PostGIS pgsnapshot OSM schema representation partitioned by H3 regions, and/or into Arrow IPC/Parquet dumps