Merged
66 changes: 5 additions & 61 deletions doc/getting_started/tutorials/13.ctable-basics.ipynb
@@ -774,12 +774,7 @@
 "cell_type": "markdown",
 "id": "4f466e5d",
 "metadata": {},
-"source": [
-"### 3.3 Sorting\n",
-"\n",
-"`sort_by()` returns a sorted copy by default (or sorts in-place with `inplace=True`).\n",
-"Multi-column sorting is supported — primary key first."
-]
+"source": "### 3.3 Sorting\n\n`sort_by()` returns a sorted copy by default (or sorts in-place with `inplace=True`).\nPass `view=True` for a zero-copy sorted **view** that shares the table's data and gathers\nrows on demand — ideal for reading a sorted slice of a large table without copying it.\nMulti-column sorting is supported — primary key first."
 },
 {
 "cell_type": "code",
@@ -1197,37 +1192,9 @@
 "start_time": "2026-05-21T09:38:01.039615Z"
 }
 },
-"source": [
-"# Top 10 hottest days in Madrid across the whole year\n",
-"# Sort the full table, then filter — views cannot be sorted directly\n",
-"hottest_all = climate.sort_by(\"temperature\", ascending=False)\n",
-"madrid_sorted = hottest_all.where(hottest_all.city == \"Madrid\")\n",
-"print(\"10 hottest days in Madrid:\")\n",
-"print(madrid_sorted.select([\"city\", \"day\", \"temperature\", \"humidity\"]).head(10))"
-],
-"outputs": [
-{
-"name": "stdout",
-"output_type": "stream",
-"text": [
-"10 hottest days in Madrid:\n",
-" city day temperature humidity\n",
-"0 Madrid 191 31.399208 42.543335\n",
-"1 Madrid 190 31.232576 44.303246\n",
-"2 Madrid 227 31.227442 46.992290\n",
-"3 Madrid 194 30.915184 35.044228\n",
-"4 Madrid 186 30.879374 48.080303\n",
-"5 Madrid 202 30.745684 43.722813\n",
-"6 Madrid 177 30.469023 38.390163\n",
-"7 Madrid 163 30.215179 46.051888\n",
-"8 Madrid 181 30.181025 43.726521\n",
-"9 Madrid 184 29.936199 50.654797\n",
-"\n",
-"[10 rows x 4 columns]\n"
-]
-}
-],
-"execution_count": 21
+"source": "# Top 10 hottest days in Madrid across the whole year.\n# Views *can* be sorted: sort_by() on a where()-view returns a zero-copy sorted\n# view — it shares the table's columns and gathers rows on demand, no full-table\n# copy. (On a base table, pass view=True for the same lazy behaviour.)\nmadrid = climate.where(climate.city == \"Madrid\")\nmadrid_sorted = madrid.sort_by(\"temperature\", ascending=False)\nprint(\"10 hottest days in Madrid:\")\nprint(madrid_sorted.select([\"city\", \"day\", \"temperature\", \"humidity\"]).head(10))",
+"outputs": [],
+"execution_count": null
 },
 {
 "cell_type": "markdown",
@@ -2876,30 +2843,7 @@
 "cell_type": "markdown",
 "id": "405cd155",
 "metadata": {},
-"source": [
-"---\n",
-"## Summary\n",
-"\n",
-"Here's everything we covered:\n",
-"\n",
-"| Feature | API |\n",
-"|---------|-----|\n",
-"| Create | `CTable(Schema)`, `CTable(Schema, new_data=...)` |\n",
-"| Insert | `append(row)`, `extend(list_or_array)` |\n",
-"| View | `head()`, `tail()`, `print(t)`, `t.info()` |\n",
-"| Filter | `where(expr)` → view |\n",
-"| Project | `select([cols])` → view |\n",
-"| Sort | `sort_by(cols)`, `sort_by(cols, inplace=True)` |\n",
-"| Aggregates | `col.sum()`, `.mean()`, `.std()`, `.min()`, `.max()` |\n",
-"| Stats | `describe()`, `cov()` |\n",
-"| Mutate | `delete()`, `compact()`, `add_column()`, `drop_column()`, `assign()` |\n",
-"| Persist | `save(path)`, `to_b2z()`, `to_b2d()`, `CTable.open(path)`, `CTable.load(path)` |\n",
-"| Interop | `to_arrow()`, `from_arrow()`, `to_csv()`, `from_csv()` |\n",
-"| Nullable | `null_value=` on spec, `is_null()`, `notnull()`, `null_count()` |\n",
-"\n",
-"CTable is designed for **compressed analytical workloads** — large tables that need to stay small in RAM\n",
-"while still being fast to query and easy to persist."
-]
+"source": "---\n## Summary\n\nHere's everything we covered:\n\n| Feature | API |\n|---------|-----|\n| Create | `CTable(Schema)`, `CTable(Schema, new_data=...)` |\n| Insert | `append(row)`, `extend(list_or_array)` |\n| View | `head()`, `tail()`, `print(t)`, `t.info()` |\n| Filter | `where(expr)` → view |\n| Project | `select([cols])` → view |\n| Sort | `sort_by(cols)`, `sort_by(cols, view=True)`, `sort_by(cols, inplace=True)` |\n| Aggregates | `col.sum()`, `.mean()`, `.std()`, `.min()`, `.max()` |\n| Stats | `describe()`, `cov()` |\n| Mutate | `delete()`, `compact()`, `add_column()`, `drop_column()`, `assign()` |\n| Persist | `save(path)`, `to_b2z()`, `to_b2d()`, `CTable.open(path)`, `CTable.load(path)` |\n| Interop | `to_arrow()`, `from_arrow()`, `to_csv()`, `from_csv()` |\n| Nullable | `null_value=` on spec, `is_null()`, `notnull()`, `null_count()` |\n\nCTable is designed for **compressed analytical workloads** — large tables that need to stay small in RAM\nwhile still being fast to query and easy to persist."
 },
 ],
 "metadata": {
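
The notebook text in this diff describes a sorted view as something that "shares the table's data and gathers rows on demand". As a rough illustration of that idea only, here is a pure-Python sketch that keeps the column data shared and stores just a sorted list of row indices. `SortedView` and `sort_view` are hypothetical names made up for this sketch; they are not CTable's API, and nothing here is claimed to match how CTable implements `sort_by()` internally.

```python
class SortedView:
    """Hypothetical stand-in for a lazily sorted view (not CTable's class)."""

    def __init__(self, columns, order):
        self._columns = columns  # shared reference to the column dict, no copy
        self._order = order      # row indices arranged in sorted order

    def head(self, n=5):
        # Gather only the first n rows of each column, on demand.
        idx = self._order[:n]
        return {name: [col[i] for i in idx] for name, col in self._columns.items()}


def sort_view(columns, key, ascending=True, rows=None):
    """Order `rows` (all rows when None) by column `key` without copying columns."""
    if rows is None:
        rows = range(len(columns[key]))
    order = sorted(rows, key=lambda i: columns[key][i], reverse=not ascending)
    return SortedView(columns, order)


# Toy data standing in for the tutorial's climate table (values invented).
columns = {
    "city": ["Madrid", "Oslo", "Madrid", "Oslo", "Madrid"],
    "temperature": [30.5, 12.0, 31.4, 9.5, 28.1],
}

# Filter first (row indices of the Madrid rows), then sort only those rows.
madrid_rows = [i for i, city in enumerate(columns["city"]) if city == "Madrid"]
madrid_sorted = sort_view(columns, "temperature", ascending=False, rows=madrid_rows)
top2 = madrid_sorted.head(2)
print(top2)
```

In this sketch only the index list is sorted, so filtering before sorting means fewer rows to order, which mirrors the filter-then-sort order the updated cell uses.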