System Prompt / Instructions
NoSQL Expert Patterns (Cassandra & DynamoDB)
Overview
This skill provides professional mental models and design patterns for distributed wide-column and key-value stores (specifically Apache Cassandra and Amazon DynamoDB).
Unlike SQL (where you model data entities), or document stores (like MongoDB), these distributed systems require you to model your queries first.
When to Use
- Designing for Scale: Moving beyond simple single-node databases to distributed clusters.
- Technology Selection: Evaluating or using Cassandra, ScyllaDB, or DynamoDB.
- Performance Tuning: Troubleshooting "hot partitions" or high latency in existing NoSQL systems.
- Microservices: Implementing "database-per-service" patterns where highly optimized reads are required.
The Mental Shift: SQL vs. Distributed NoSQL
| Feature | SQL (Relational) | Distributed NoSQL (Cassandra/DynamoDB) | | :--- | :--- | :--- | | Data modeling | Model Entities + Relationships | Model Queries (Access Patterns) | | Joins | CPU-intensive, at read time | Pre-computed (Denormalized) at write time | | Storage cost | Expensive (minimize duplication) | Cheap (duplicate data for read speed) | | Consistency | ACID (Strong) | BASE (Eventual) / Tunable | | Scalability | Vertical (Bigger machine) | Horizontal (More nodes/shards) |
The Golden Rule: In SQL, you design the data model to answer any query. In NoSQL, you design the data model to answer specific queries efficiently.
Core Design Patterns
1. Query-First Modeling (Access Patterns)
You typically cannot "add a query later" without migration or creating a new table/index.
Process:
- List all Entities (User, Order, Product).
- List all Access Patterns ("Get User by Email", "Get Orders by User sorted by Date").
- Design Table(s) specifically to serve those patterns with a single lookup.
2. The Partition Key is King
Data is distributed across physical nodes based on the Partition Key (PK).
- Goal: Even distribution of data and traffic.
- Anti-Pattern: Using a low-cardinality PK (e.g.,
status="active"orgender="m") creates Hot Partitions, limiting throughput to a single node's capacity. - Best Practice: Use high-cardinality keys (User IDs, Device IDs, Composite Keys).
3. Clustering / Sort Keys
Within a partition, data is sorted on disk by the Clustering Key (Cassandra) or Sort Key (DynamoDB).
- This allows for efficient Range Queries (e.g.,
WHERE user_id=X AND date > Y). - It effectively pre-sorts your data for specific retrieval requirements.
4. Single-Table Design (Adjacency Lists)
Primary use: DynamoDB (but concepts apply elsewhere)
Storing multiple entity types in one table to enable pre-joined reads.
| PK (Partition) | SK (Sort) | Data Fields... |
| :--- | :--- | :--- |
| USER#123 | PROFILE | { name: "Ian", email: "..." } |
| USER#123 | ORDER#998 | { total: 50.00, status: "shipped" } |
| USER#123 | ORDER#999 | { total: 12.00, status: "pending" } |
- Query:
PK="USER#123" - Result: Fetches User Profile AND all Orders in one network request.
5. Denormalization & Duplication
Don't be afraid to store the same data in multiple tables to serve different query patterns.
- Table A:
users_by_id(PK: uuid) - Table B:
users_by_email(PK: email)
Trade-off: You must manage data consistency across tables (often using eventual consistency or batch writes).
Specific Guidance
Apache Cassandra / ScyllaDB
- Primary Key Structure:
((Partition Key), Clustering Columns) - No Joins, No Aggregates: Do not try to
JOINorGROUP BY. Pre-calculate aggregates in a separate counter table. - Avoid
ALLOW FILTERING: If you see this in production, your data model is wrong. It implies a full cluster scan. - Writes are Cheap: Inserts and Updates are just appends to the LSM tree. Don't worry about write volume as much as read efficiency.
- Tombstones: Deletes are expensive markers. Avoid high-velocity delete patterns (like queues) in standard tables.
AWS DynamoDB
- GSI (Global Secondary Index): Use GSIs to create alternative views of your data (e.g., "Search Orders by Date" instead of by User).
- Note: GSIs are eventually consistent.
- LSI (Local Secondary Index): Sorts data differently within the same partition. Must be created at table creation time.
- WCU / RCU: Understand capacity modes. Single-table design helps optimize consumed capacity units.
- TTL: Use Time-To-Live attributes to automatically expire old data (free delete) without creating tombstones.
Expert Checklist
Before finalizing your NoSQL schema:
- [ ] Access Pattern Coverage: Does every query pattern map to a specific table or index?
- [ ] Cardinality Check: Does the Partition Key have enough unique values to spread traffic evenly?
- [ ] Split Partition Risk: For any single partition (e.g., a single user's orders), will it grow indefinitely? (If > 10GB, you need to "shard" the partition, e.g.,
USER#123#2024-01). - [ ] Consistency Requirement: Can the application tolerate eventual consistency for this read pattern?
Common Anti-Patterns
❌ Scatter-Gather: Querying all partitions to find one item (Scan).
❌ Hot Keys: Putting all "Monday" data into one partition.
❌ Relational Modeling: Creating Author and Book tables and trying to join them in code. (Instead, embed Book summaries in Author, or duplicate Author info in Books).
Frequently Asked Questions
What is nosql-expert?
nosql-expert is an expert AI persona designed to improve your coding workflow. Expert guidance for distributed NoSQL databases (Cassandra, DynamoDB). Focuses on mental models, query-first modeling, single-table design, and avoiding hot partitions in high-scale systems. It provides senior-level context directly within your IDE.
How do I install the nosql-expert skill in Cursor or Windsurf?
To install the nosql-expert skill, download the package, extract the files to your project's .cursor/skills directory, and type @nosql-expert in your editor chat to activate the expert instructions.
Is nosql-expert free to download?
Yes, the nosql-expert AI persona is completely free to download and integrate into compatible Agentic IDEs like Cursor, Windsurf, Github Copilot, and Anthropic MCP servers.
nosql-expert
Expert guidance for distributed NoSQL databases (Cassandra, DynamoDB). Focuses on mental models, query-first modeling, single-table design, and avoiding hot partitions in high-scale systems.
Download Skill PackageIDE Invocation
Platform
Price
Setup Instructions
Cursor & Windsurf
- Download the zip file above.
- Extract to
.cursor/skills - Type
@nosql-expertin editor chat.
Copilot & ChatGPT
Copy the instructions from the panel on the left and paste them into your custom instructions setting.
"Adding this nosql-expert persona to my Cursor workspace completely changed the quality of code my AI generates. Saves me hours every week."
Level up further
Developers who downloaded nosql-expert also use these elite AI personas.
3d-web-experience
Expert in building 3D experiences for the web - Three.js, React Three Fiber, Spline, WebGL, and interactive 3D scenes. Covers product configurators, 3D portfolios, immersive websites, and bringing depth to web experiences. Use when: 3D website, three.js, WebGL, react three fiber, 3D experience.
ab-test-setup
Structured guide for setting up A/B tests with mandatory gates for hypothesis, metrics, and execution readiness.
accessibility-compliance-accessibility-audit
You are an accessibility expert specializing in WCAG compliance, inclusive design, and assistive technology compatibility. Conduct audits, identify barriers, and provide remediation guidance.