Models in DynamoBao are defined using YAML configuration files. These definitions are used to generate the corresponding model classes with all necessary fields, indexes, and relationships.
Basic Structure
models:
ModelName:
modelPrefix: "x" # Required: short (1-4 unique character prefix) for the model. This must be unique and cannot change later. It is not user visible.
tableType: "standard" # Optional: "standard" (default) or "mapping"
fields: {} # Required: field definitions
primaryKey: {} # Required: primary key configuration
indexes: {} # Optional: secondary indexes
uniqueConstraints: {} # Optional: unique constraints
Field Types
Fields can be defined with various types and options:
Basic Fields
fields:
stringField:
type: StringField
required: true # Optional: makes the field required
integerField:
type: IntegerField
defaultValue: 0 # Optional: default value
floatField:
type: FloatField
booleanField:
type: BooleanField
defaultValue: false
binaryField:
type: BinaryField # Stores Buffer or Uint8Array data
Date and Time Fields
fields:
dateTime:
type: DateTimeField # Stores any date/time value
createdAt:
type: CreateDateField # Automatically sets creation timestamp
modifiedAt:
type: ModifiedDateField # Automatically updates on modification
ttl:
type: TtlField # Time-to-live field for automatic deletion
UlidField
The UlidField
is a special field that automatically generates ULID values. It is recommended to use this field for the primary key of your model.
fields:
postId:
type: UlidField
autoAssign: true
ULIDs are 128-bit values that are lexicographically sortable and can be generated in parallel. They are designed to be collision resistant and are compatible with the UUID format.
They are especially useful for dynamo since they can be used as sort keys and sort by the date they were created. This makes the default sort order when an id is used feel more natural than being purely random or requiring a secondary index to sort by creation date.
VersionField
The VersionField
is a special field that automatically generates a new larger version every time the object is saved. It is recommended to use this field for the version of your model. Note that this is a string, not a number.
fields:
version:
type: VersionField
CounterField
The CounterField
is a special field that allows you to increment and decrement a counter atomically. This is useful for creating counters that can be incremented/decremented across users without having to worry about race conditions.
fields:
counter:
type: CounterField
TtlField
The TtlField
is a special field that allows you to automatically delete an object after a certain amount of time. This is useful for creating time-based TTLs for objects. The field must be named ttl
.
fields:
ttl:
type: TtlField
RelatedField
The RelatedField
is a special field that allows you to reference another model. This is useful for creating relationships between models.
fields:
userId:
type: RelatedField
model: User
required: true
This field will automatically generate a getter method on the model that allows you to query for the related model. Two of these fields can be used to create a many-to-many relationship. See the Mapping Tables section for more information.
Complete Field Type Reference
StringField
: Text dataIntegerField
: Whole numbersFloatField
: Decimal numbersBooleanField
: True/false valuesBinaryField
: Binary data (Buffer or Uint8Array)DateTimeField
: Date and time valuesCreateDateField
: Automatic creation timestampModifiedDateField
: Automatic modification timestampUlidField
: Unique identifiers (often used for IDs)VersionField
: Automatic version tracking for optimistic lockingCounterField
: Atomic counters that can be incremented/decrementedRelatedField
: References to other modelsTtlField
: Time-to-live field for automatic item deletion (field must be namedttl
)
Field Options
Common options that can be applied to fields:
fields:
exampleField:
type: StringField
required: true # Field must have a value
defaultValue: "test" # Default value if none provided
Primary Key Configuration
Every model requires a primary key configuration:
primaryKey:
partitionKey: "userId" # Required: field to use as partition key
sortKey: "createdAt" # Optional: field to use as sort key
The primary key's partitionKey
and sortKey
can never change once the model is created. If you need to change them, you will need to delete the object and create a new object with the new partitionKey
and sortKey
. With this in mind, it is recommended to choose a partitionKey
and sortKey
that will not change.
In addition to specifying a field as a partitionKey
or sortKey
, you can also specify the modelPrefix
for either the partitionKey
or sortKey
. This is useful if you want to query all the objects of a particular model . However, if you expect this particular model to grow large, you should not use the modelPrefix as the partitionKey since all objects of this model will be stored in the same partition.
If the sortKey is not specified, it will use the modelPrefix as the sortKey. This is handled automatically and not something you need to specify.
Indexes
You can add up to 3 secondary indexes (gsi1
, gsi2
, gsi3
) to a model. Much like a primary key, each index has a partitionKey
and sortKey
. However, unlike a primary key, you can change the partitionKey
and sortKey
at any time.
Indexes incur additional costs, so you should only add them if you need them. In particular, if you are creating an index to support an infrequent query, it would be good to test the query with a filter rather than an index to see if the index necessary before adding it.
Secondary indexes enable additional query patterns:
indexes:
tagsForPost:
partitionKey: "postId" # Field to use as partition key
sortKey: "tagId" # Field to use as sort key
indexId: "gsi1" # GSI identifier (gsi1, gsi2, etc.)
# Reference to primary key
postsForTag: primaryKey
Adding the primaryKey
to the index is a shorthand that allows you to give the primary key a name and let you query by it. It does not create a new index. If you plan to query by the primary key, it is recommended to name it this way.
Iterating Over All Objects for a Model
If you want to iterate over all the objects for a model, you can set the modelPrefix
as the partitionKey
for the model. This will allow you to query the model by the modelPrefix
and iterate over all the objects of that model.
Here's an example:
models:
User:
modelPrefix: "u"
fields:
userId:
type: UlidField
autoAssign: true
primaryKey:
partitionKey: "userId"
indexes:
allUsers:
partitionKey: "modelPrefix"
sortKey: "userId"
If you don't set this up, there is no efficient way to iterate over all the objects for a given model. However, if you expect this model to grow large, you should not use the modelPrefix as the partitionKey
since all objects of this model will be stored in the same partition. This is a fundamental tension present in Dynamo single table design.
One way of addressing this is to choose a root model and enable iteration on it. This allows you to iterate over all the objects for the root model and use that to access all the other data in the system.
For instance, a User
model could have an allUsers
index that allows you to iterate over all the users in the system. If all other data in the system is owned by a user, you can iterate over all the users to get all the data in the system.
A good root model isn't too large and doesn't get written to frequently. For instance, a User
model may be a good candidate, since you typically have many fewer users than posts, messages, or other objects in the system. Users also don't get written to as frequently as other objects, so it's a good fit for iteration. However, you should be aware that if you ever try to write lots of users to dynamo quickly (e.g. bulk importing users without rate limiting), you may hit write capacity limits since it's all on a single partition.
It is not recommended to use the modelPrefix
as the primaryKey's partitionKey, since it cannot be changed later. If you do enable iteration, I recommend using an index with the modelPrefix
as the partitionKey since you can always stop using the index if you run into capacity issues with no issues to the rest of the system.
If you anticipate having a root model that is very large (1M+ objects), you will probably want to use a partitionKey
that is the modelPrefix and a short hash of the object id. This will allow you to iterate over all the objects while also splitting the data across multiple partitions. If you need this, please open an issue and we can discuss further.
Unique Constraints
You can specify up to 3 unique constraints (uc1
, uc2
, uc3
) for a model.
uniqueConstraints:
uniqueEmail:
field: "email"
uniqueConstraintId: "uc1" # Unique constraint identifier
Unique constraints are implemented using Dynamo transactions, so they incur significant costs. Only use them where you really need them (e.g. to prevent duplicate emails, or map to a globally unique external id). Primary keys are already unique, so you may be able to use them instead, though keep in mind that they can't be changed later.
Creating a unique constraint will also enable fast lookups for that field. Before creating an index on a field that has a unique constraint, consider whether you can use the unique constraint instead.
Mapping Tables
Mapping tables are used to create many-to-many relationships between models. They are defined using tableType: "mapping"
and typically contain RelatedFields to the models being connected.
Basic Structure
ModelName:
tableType: "mapping" # Identifies this as a mapping table
modelPrefix: "xx" # Required: 2-character prefix
fields:
firstModelId: # Related field to first model
type: RelatedField
model: FirstModel
required: true
secondModelId: # Related field to second model
type: RelatedField
model: SecondModel
required: true
createdAt: # Optional: tracking fields
type: CreateDateField
primaryKey: # Define how records are stored
partitionKey: firstModelId
sortKey: secondModelId
Real Example: TaggedPost
Here's a complete example of a mapping table that connects Tags to Posts:
TaggedPost:
tableType: "mapping"
modelPrefix: "tp"
fields:
tagId:
type: RelatedField
model: Tag
required: true
postId:
type: RelatedField
model: Post
required: true
createdAt:
type: CreateDateField
primaryKey:
partitionKey: tagId
sortKey: postId
indexes:
postsForTag: primaryKey # Query posts for a tag
tagsForPost: # Query tags for a post
partitionKey: postId
sortKey: tagId
indexId: gsi1
recentPostsForTag: # Query posts for a tag by date
partitionKey: tagId
sortKey: createdAt
indexId: gsi2
Generated Methods
When you create a mapping table, the following methods are automatically generated on the related models:
For the Tag model:
// Get all posts for a tag
async getPosts(mapSkCondition=null, limit=null, direction='ASC', startKey=null)
// Get posts for a tag ordered by creation date
async getRecentPosts(mapSkCondition=null, limit=null, direction='ASC', startKey=null)
For the Post model:
// Get all tags for a post
async getTags(mapSkCondition=null, limit=null, direction='ASC', startKey=null)
For many-to-many relationships, the methods are called get
rather than query
because it's only possible to query on the mapping table, not the tags/posts themselves. In practice, you end up paging through these objects in the order of the mapping table rather than filtering them, so it's more like a get than a query. For one-to-many relationships, we add query
methods to the related model since it's possible to query the related model objects.
Best Practices
- Naming Convention: Name mapping tables by combining the two model names (e.g.,
TaggedPost
,UserGroup
) - Required Fields: Make the relationship fields required to ensure data integrity
- Tracking Fields: Include
createdAt
if you need to track when relationships were created - Indexes: Create indexes for querying from both directions
- Sort Keys: Consider using
createdAt
as a sort key in secondary indexes for time-based queries
Common Index Patterns
For a mapping table connecting ModelA and ModelB, you will want to use the primaryKey to relate one model to the other. Then you will also want to create an index to query the reverse (assuming you want to be able to query the reverse side of the relationship).
indexes:
# Primary access pattern
bsForA: primaryKey # Query Bs for an A
# Reverse lookup
asForB: # Query As for a B
partitionKey: modelBId
sortKey: modelAId
indexId: gsi1
Complete Example
models:
Post:
modelPrefix: "p"
fields:
postId:
type: UlidField
autoAssign: true
userId:
type: RelatedField
model: User
required: true
title:
type: StringField
required: true
content:
type: StringField
createdAt:
type: CreateDateField
primaryKey:
partitionKey: postId
indexes:
postsForUser:
partitionKey: userId
sortKey: createdAt
indexId: gsi1
allPosts:
partitionKey: modelPrefix
sortKey: postId
indexId: gsi2
Generated Features
The model generator will automatically create:
- Field definitions with validation
- Primary key configuration
- Secondary index queries
- Related field getters (e.g.,
getUser()
for auserId
field) - Unique constraint validators
- Query methods for indexes
- Mapping table relationship helpers
Best Practices
- Use clear, descriptive names for models and fields
- Always include a
createdAt
field for tracking - Use
UlidField
for ID fields withautoAssign: true
- Define indexes based on your query patterns
- Use mapping tables for many-to-many relationships
- Keep model prefixes unique and short (1-4 characters)