[CUBIG DTS] Privacy-Safe Synthetic Data for Text, Tabular, and Images

Generate privacy-safe synthetic text, tabular, and image data with differential privacy, without exposing original customer records. CUBIG DTS helps regulated teams augment scarce data, correct class imbalance, replace missing values, and improve AI training and analytics.

Overview

CUBIG DTS is an enterprise synthetic data engine for organizations that cannot freely use original data because of privacy, access, or data-quality constraints. It generates privacy-safe synthetic data across text, tabular, and image formats. Teams can augment scarce datasets, correct class imbalance, replace missing values, and build higher-utility training data for AI and analytics workflows.

DTS applies differential privacy at generation time and operates in a zero-access architecture: raw customer records never leave the client environment, and only the synthetic output crosses the security boundary. This approach helps teams in finance, healthcare, and the public sector work with representative data without exposing regulated records. Because DTS produces entirely new data points rather than masking or transforming originals, the output is structurally distinct from de-identified or anonymized copies of source data.

Common use cases include replacing restricted datasets that cannot leave the security perimeter, generating balanced training sets for fraud detection or medical diagnosis models, filling coverage gaps where minority classes are underrepresented, and supplementing missing values in incomplete records.

DTS runs on GPU infrastructure and is delivered as a container that integrates with Amazon ECS and Amazon EKS, so teams can scale synthetic data generation within their existing AWS environment.

Highlights

Multimodal synthetic data across text, tabular, and image workflows
Differential privacy and zero-access processing so raw records stay inside the client environment
Augment scarce data, correct class imbalance, and replace missing values for better AI training and analytics

Details

Sold by

CUBIG

Pricing

Pricing is based on the duration and terms of your contract with the vendor. This entitles you to a specified quantity of use for the contract duration. If you choose not to renew or replace your contract before it ends, access to these entitlements will expire.

Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

1-month contract (1)

Dimension	Description	Cost/month
DTS Server	DTS pricing varies based on data volume, data type/format, and your consumption needs. Listed prices are placeholders. Please contact our sales team at contact@cubig.ai to request a custom private offer tailored to your requirements.	$9,500.00

Vendor refund policy

Please reach us at contact@cubig.ai for refund policy.

Legal

Vendor terms and conditions

Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

Content disclaimer

Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

Usage information

Delivery details

Docker Container - GPU Required

Supported services:

Amazon ECS
Amazon EKS

Container image

Containers are lightweight, portable execution environments that wrap server application software in a filesystem that includes everything it needs to run. Container applications run on supported container runtimes and orchestration services, such as Amazon Elastic Container Service (Amazon ECS) or Amazon Elastic Kubernetes Service (Amazon EKS). Both eliminate the need for you to install and operate your own container orchestration software by managing and scheduling containers on a scalable cluster of virtual machines.

Version release notes

Security Update - SQLite 3.50.2

This release addresses CVE-2025-6965, an integer truncation and memory corruption vulnerability in SQLite. The bundled SQLite library has been upgraded from 3.37.2 to 3.50.2 in the Docker runtime image.

Changes

Upgraded SQLite to 3.50.2 to resolve CVE-2025-6965
Added symbolic links to ensure Python loads the patched SQLite library
Refactored Dockerfile to copy pip packages directly from builder stage, avoiding runtime source-build issues
No functional changes to the DTS API or synthetic data generation pipeline
Backward compatible with v1.0.2

Verification

sqlite3.sqlite_version reports 3.50.2 in the running container
Health endpoint (GET /health) returns 200 OK
All core modules (interface, utils, faiss, torch) import successfully
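
The checks above can be scripted and run with the Python interpreter inside the container. The following is a minimal sketch rather than the vendor's own test: the helper names `version_tuple`, `sqlite_is_patched`, and `missing_modules` are hypothetical, and the only facts it relies on are the SQLite version and module names listed in these release notes.

```python
import importlib
import sqlite3

PATCHED_SQLITE = (3, 50, 2)  # version named in the release notes


def version_tuple(version: str) -> tuple:
    """Turn '3.50.2' into (3, 50, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def sqlite_is_patched(version: str = sqlite3.sqlite_version) -> bool:
    """True when the SQLite library loaded by Python is at least 3.50.2."""
    return version_tuple(version) >= PATCHED_SQLITE


def missing_modules(names) -> list:
    """Return the module names that fail to import."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

In a patched container, `sqlite_is_patched()` should return True and `missing_modules(["interface", "utils", "faiss", "torch"])` should return an empty list.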

Recommended action: All users should upgrade to v1.0.4 for the security fix.

Additional details

Usage instructions

DTS AI Module - Usage Instructions

Quick Start

Step 1: Run the Container

docker run -d \
  --name dts-ai-module \
  --gpus all \
  -p 8000:8000 \
  -v /path/to/your/data:/data \
  -v /path/to/output:/results \
  709825985650.dkr.ecr.us-east-1.amazonaws.com/cubig-ai/dts-v1:1.0.2

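Replace /path/to/your/data and /path/to/output with your actual host paths. The image tag shown here is 1.0.2, while the release notes above recommend v1.0.4, so use the tag that matches the version you subscribed to.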

Step 2: Verify Health

Allow 30 to 60 seconds for the service to start, then query the health endpoint:

curl http://localhost:8000/health

Expected response: { "status": "healthy", "service": "DTS AI Module", "version": "1.0.2" }
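
Instead of waiting a fixed interval, a script can poll the health endpoint until it reports the expected status. The sketch below assumes only the URL and the "status": "healthy" field shown above; the function names, the retry count, and the delay are illustrative choices, not part of the product.

```python
import json
import time
import urllib.request

HEALTH_URL = "http://localhost:8000/health"  # endpoint from Step 2


def fetch_health(url: str = HEALTH_URL) -> dict:
    """GET the health endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


def wait_until_healthy(fetch=fetch_health, attempts: int = 12,
                       delay: float = 5.0, sleep=time.sleep) -> dict:
    """Poll until the service reports "healthy" or the attempts run out."""
    last_error = None
    for _ in range(attempts):
        try:
            body = fetch()
            if body.get("status") == "healthy":
                return body
            last_error = RuntimeError(f"unexpected status: {body!r}")
        except (OSError, ValueError) as exc:
            # Connection refused (an OSError) is normal during startup;
            # ValueError covers a body that is not valid JSON yet.
            last_error = exc
        sleep(delay)
    raise TimeoutError(f"service not healthy after {attempts} attempts") from last_error
```

With the defaults, the loop gives the service about 60 seconds (12 attempts, 5 seconds apart) before raising TimeoutError.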

Step 3: Access Documentation

Quick Start: http://localhost:8000/
API Docs: http://localhost:8000/docs (Swagger UI)
User Guide: docker exec dts-ai-module cat /app/USER_GUIDE.md

API Usage Examples

Example 1: Image Synthetic Data

Generate 100 synthetic medical X-ray images:

curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{ "run_id": "medical_xray_001", "modality": "image", "private_data_path": "/data/xray_images", "num_samples": 100, "iteration": 2, "epsilon": 8.0, "delta": 1e-6, "positive_prompt": "a medical chest x-ray image, high quality", "image_width": 512, "image_height": 512, "gpu_id": "0" }'

Result: Synthetic images saved to /results/medical_xray_001/final/synthetic_final/

Example 2: Text Synthetic Data

Generate 1,000 synthetic medical texts:

curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{ "run_id": "medical_text_001", "modality": "text", "private_data_path": "/data/diagnosis_texts.csv", "num_samples": 1000, "iteration": 3, "epsilon": 8.0, "domain": "medical", "sub_domain": "diagnosis", "gpu_id": "0" }'

Result: Synthetic texts saved to /results/medical_text_001/final/synthetic_final.csv

Example 3: Tabular Synthetic Data

Generate 5,000 synthetic customer records:

curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{ "run_id": "customer_data_001", "modality": "tabular", "private_data_path": "/data/customer_dataset.csv", "num_samples": 5000, "iteration": 2, "epsilon": 8.0, "numerical_columns": ["age", "income", "credit_score"], "categorical_columns": ["gender", "education", "occupation"], "gpu_id": "0" }'

Result: Synthetic data saved to /results/customer_data_001/final/synthetic_final.csv
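
The three curl calls share the same shape, so they can also be issued from Python. The sketch below uses only the standard library and the field names that appear in the examples; the helper names `build_request`, `expected_result_path`, and `submit` are hypothetical. The listing does not describe the response schema of /generate, so `submit` only assumes that the body is JSON.

```python
import json
import urllib.request

GENERATE_URL = "http://localhost:8000/generate"  # endpoint used in the examples


def build_request(run_id: str, modality: str, private_data_path: str,
                  num_samples: int, **options) -> dict:
    """Assemble a /generate payload from the fields shown in the examples."""
    if modality not in ("image", "text", "tabular"):
        raise ValueError(f"unsupported modality: {modality}")
    payload = {
        "run_id": run_id,
        "modality": modality,
        "private_data_path": private_data_path,
        "num_samples": num_samples,
    }
    payload.update(options)  # e.g. iteration, epsilon, delta, gpu_id
    return payload


def expected_result_path(run_id: str, modality: str) -> str:
    """Output location as listed on each example's Result line."""
    base = f"/results/{run_id}/final/synthetic_final"
    return base + "/" if modality == "image" else base + ".csv"


def submit(payload: dict, url: str = GENERATE_URL) -> dict:
    """POST the payload as JSON; assumes the response body is JSON."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Example 3 expressed with the helpers above (no request is sent here).
payload = build_request(
    "customer_data_001", "tabular", "/data/customer_dataset.csv", 5000,
    iteration=2, epsilon=8.0,
    numerical_columns=["age", "income", "credit_score"],
    categorical_columns=["gender", "education", "occupation"],
    gpu_id="0",
)
```

Calling `submit(payload)` against a running container sends the same request as the Example 3 curl command.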

Privacy Parameters Guide

Epsilon - Privacy Budget

Lower values: Stronger privacy protection, lower data utility
Higher values: Weaker privacy protection, higher data utility
Recommended: 8.0 for balanced privacy-utility tradeoff
Range: 0.1 (very strong) to 10.0 (moderate)

Delta - Privacy Failure Probability

Recommended: 1e-6 (should be less than 1/N, where N is the number of records in the private dataset)

Iteration - Quality Improvement

1: Fast, lower quality
2-3: Balanced (recommended)
4-5: Slower, higher quality

Troubleshooting

Issue: Connection refused to port 8000
Solution: Wait 30-60 seconds for service startup, then retry

Issue: GPU not found
Solution: Verify the NVIDIA Container Toolkit with: docker run --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi

Issue: private_data_path not found
Solution: Use the container-internal path (/data/...), not the host path

Issue: Out of memory
Solution: Reduce num_samples, iteration, or image dimensions
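
The "private_data_path not found" issue usually comes from passing a host path where the API expects the mounted path. As an illustration, the sketch below translates a host path into its container-internal equivalent using the two volume mounts from the Step 1 docker run command. The helper name `to_container_path` is hypothetical, and the host paths are the placeholders from that command.

```python
from pathlib import PurePosixPath

# Volume mounts from the Step 1 docker run command; the host paths are the
# placeholders shown there and must be replaced with your own.
MOUNTS = {
    "/path/to/your/data": "/data",
    "/path/to/output": "/results",
}


def to_container_path(host_path: str, mounts: dict = MOUNTS) -> str:
    """Map a host path under a mounted directory to its in-container path."""
    path = PurePosixPath(host_path)
    for host_root, container_root in mounts.items():
        try:
            relative = path.relative_to(host_root)
        except ValueError:
            continue  # not under this mount, try the next one
        return str(PurePosixPath(container_root) / relative)
    raise ValueError(f"{host_path} is not inside any mounted directory")
```

A file outside every mounted directory raises ValueError, which signals that it is not visible to the container at all.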

Support

For technical support:

Check API documentation at http://localhost:8000/docs
Review user guide: docker exec dts-ai-module cat /app/USER_GUIDE.md
Contact AWS Marketplace support

Support

Vendor support

Please reach us at contact@cubig.ai for any assistance or questions.

AWS infrastructure support

AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.
