Xuanwo (Collaborator) commented Feb 3, 2026

This PR adds a basic Lance skill that serves as a user guide.

Users can install it via:

npx skills add lance-format/lance

Parts of this PR were drafted with assistance from Codex (with gpt-5.2) and fully reviewed and edited by me. I take full responsibility for all changes.

github-actions bot added the documentation label ("Improvements or additions to documentation") Feb 3, 2026
github-actions bot (Contributor) commented Feb 3, 2026

PR Review

Thanks for adding the Lance user guide skill! This is a useful addition for helping code agents assist Lance users.

P1 Issues

Deprecated API usage - The documentation and example script use num_partitions which is deprecated (see python/python/lance/dataset.py:2754). Consider updating to use target_partition_size instead, or at minimum add a note that num_partitions is deprecated.

Affected locations:

  • SKILL.md line 140: num_partitions=256
  • references/index-selection.md line 39: recommends num_partitions
  • scripts/python_end_to_end.py line 55: uses num_partitions
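
A quick scan can confirm whether any other uses of the deprecated parameter remain. The sketch below is a hypothetical helper and not part of this PR: the function name `find_param_uses` and the sample text are assumptions made for illustration, and the sample is not the actual content of any file listed above.

```python
import re


def find_param_uses(text, param="num_partitions"):
    """Return (line_number, stripped_line) pairs where `param` appears as a whole word."""
    pattern = re.compile(rf"\b{re.escape(param)}\b")
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if pattern.search(line)
    ]


# Illustrative sample only; not the real contents of any file in the PR.
sample = """\
ds.create_index(
    "vector",
    index_type="IVF_PQ",
    num_partitions=256,
)
"""

for number, line in find_param_uses(sample):
    print(f"line {number}: {line}")
```

Matching on word boundaries avoids flagging longer identifiers that merely contain the parameter name.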

Minor Suggestions (non-blocking)

  • The valid vector index types are: IVF_FLAT, IVF_PQ, IVF_SQ, IVF_HNSW_FLAT, IVF_HNSW_PQ, IVF_HNSW_SQ, IVF_RQ. The list in index-selection.md is mostly accurate but omits IVF_HNSW_FLAT, which might be useful for users who want exact search within HNSW clusters.

Overall the skill content is well-structured with clear workflow decision trees and troubleshooting patterns.

- Equality filters on high-cardinality columns: start with `BTREE`
- Equality / IN-list filters on low-cardinality columns: start with `BITMAP`
- Text search: start with `FTS` (or other text index types supported by the version)
- Range filters: start with range-friendly options (for example `ZONEMAP` when appropriate)
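
The four rules above can be sketched as a small lookup. This is a minimal illustration under stated assumptions, not Lance's API: the function name `suggest_scalar_index` and the `filter_kind` labels are hypothetical, and it covers only the cases the list names.

```python
def suggest_scalar_index(filter_kind, high_cardinality=False):
    """Map a filter pattern to the starting index type named in the list above.

    `filter_kind` is a hypothetical label: "equality", "in_list", "text", or "range".
    Returns None where the list gives no guidance.
    """
    if filter_kind == "text":
        return "FTS"
    if filter_kind == "range":
        # The list offers ZONEMAP only as an example "when appropriate".
        return "ZONEMAP"
    if filter_kind == "equality":
        return "BTREE" if high_cardinality else "BITMAP"
    if filter_kind == "in_list":
        # The list covers IN-list filters on low-cardinality columns only.
        return None if high_cardinality else "BITMAP"
    raise ValueError(f"unknown filter kind: {filter_kind!r}")


print(suggest_scalar_index("equality", high_cardinality=True))
print(suggest_scalar_index("in_list"))
```

These are starting points only; the right choice still depends on the data and the Lance version in use.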
Contributor

We are missing a few here, such as label list, bloom filter, and rtree. We should also mention how to index JSON data.
