Blob CSV to Table¶
Trigger: Event Grid (Blob) | Guarantee: at-least-once | Complexity: intermediate
Overview¶
The examples/blob-and-file-triggers/blob_csv_to_table/ recipe processes uploaded CSV files and materializes each row into Azure Table Storage. The function wakes up on a blob-created event, downloads the CSV, parses it with csv.DictReader, and upserts each row using a deterministic partition and row key strategy.
This pattern is useful for lightweight ingestion pipelines where files arrive in batches but downstream consumers want queryable row-level records. Table Storage keeps the operational footprint small while still giving you a durable projection of the uploaded file.
When to Use¶
- CSV files arrive in Blob Storage and need row-level indexing.
- A lightweight NoSQL table is enough for the target shape.
- Reprocessing the same file should overwrite or merge existing rows.
When NOT to Use¶
- The file schema is highly variable or nested.
- You need relational constraints or analytical query features.
- A single upload contains more data than one function execution should handle.
Architecture¶
flowchart LR
csv[CSV upload] --> blob[Blob Storage uploads container]
blob --> eg[Event Grid]
eg --> fn[blob_csv_to_table]
fn --> table[Azure Table Storage]
fn --> logs[Structured logs]
Behavior¶
sequenceDiagram
participant Storage as Blob Storage
participant Function as CSV ingestion function
participant Table as Table Storage
Storage->>Function: BlobCreated event
Function->>Storage: Download CSV
loop each row
Function->>Table: upsert_entity(PartitionKey, RowKey, ...)
end
Implementation¶
The Event Grid handler resolves the uploaded blob and loads its rows into the configured table.
@app.event_grid_trigger(arg_name="event")
@with_context
def blob_csv_to_table(event: func.EventGridEvent) -> None:
data = event.get_json()
container, blob_name = _blob_parts(str(data["url"]))
The sample uses BlobClient to download the CSV, TableServiceClient to upsert entities, and azure-functions-logging to record row counts and source filenames.
Run Locally¶
cd examples/blob-and-file-triggers/blob_csv_to_table- Create and activate a virtual environment.
pip install -e ".[dev]"- Copy
local.settings.json.exampletolocal.settings.json. - Start Azurite or configure a real storage account with blob and table endpoints.
- Run
func startand submit a blob-created Event Grid payload for a CSV upload.
Expected Output¶
[Information] Loaded CSV blob into Table Storage blob_name=daily/orders.csv table_name=CsvRows processed=42
Production Considerations¶
- Schema drift: validate required headers before writing any rows.
- Partitioning: choose partition keys that support your most common queries.
- Replay handling: use deterministic keys so the same file can be reloaded safely.
- File size: split very large CSVs or hand them to a batch pipeline.
- Poison data: move malformed files to a quarantine path for operator review.