Curvenote: Positioning, Adoption Wedge, and Architecture
Sources: 1 • Confidence: Medium • Updated: 2026-03-25 17:57
Key takeaways
- Most Curvenote users are already part of a Jupyter-like computational community and have existing skills with datasets, computation, and coding.
- A major barrier to beyond-PDF research communication is social and incentive-driven: researchers still need downloadable PDFs to submit to journals and to receive credit.
- Continuous Science Foundation emphasizes community and incentive alignment (including earlier attribution and licensing) to reduce fear of being scooped and to increase credit for sharing.
- Improving scientific reproducibility requires both (a) integrity (auditable datasets and pipelines) and (b) reuse (others can access and apply the methods and data).
- Scientific data formats are evolving from HDF5 toward Zarr-based, cloud-optimized formats designed for object storage, with self-describing metadata and more efficient partial access.
Sections
Curvenote: Positioning, Adoption Wedge, and Architecture
- Most Curvenote users are already part of a Jupyter-like computational community and have existing skills with datasets, computation, and coding.
- Curvenote's early product work focused on lowering technical barriers via a WYSIWYG editor integrated with Jupyter, including copying notebook cell outputs inline into documents.
- JupyterBook, using MyST Markdown, was created to turn notebooks into publishable narratives packaging environment, code, data, and narrative, and it has been used to build tens of thousands of educational texts and courses.
- Curvenote is positioned as a scientific content management system intended to bridge modern research authoring (e.g., notebooks) and legacy publisher workflows (e.g., FTP/XML) that are not designed for data- or compute-rich publishing.
- Curvenote does not store large-scale datasets itself and instead integrates with external repositories or partners for storage and access.
- Curvenote originated in computational geoscience, and current work is focused mainly on computational bioscience and computational neuroscience sharing contexts.
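The JupyterBook/MyST workflow described above can be illustrated with a short MyST Markdown fragment. The cell content, tag, and prose are illustrative assumptions, not taken from the source; the `{code-cell}` directive is MyST-NB's mechanism for executable cells inside a narrative document:

````markdown
A computational narrative keeps prose and executable cells together, so
outputs are regenerated from code and data instead of pasted in as images.

```{code-cell} ipython3
:tags: [hide-input]

import numpy as np

# This output is rebuilt on every execution of the book,
# rather than being a static screenshot.
np.mean([1.0, 2.0, 3.0])
```
````

Because the environment, code, and narrative travel together, the rendered output stays traceable to the computation that produced it.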
Publication-Layer Mismatch and Static Outputs
- A major barrier to beyond-PDF research communication is social and incentive-driven: researchers still need downloadable PDFs to submit to journals and to receive credit.
- There is a workflow mismatch between computationally reproducible research practices and the paper-publication process, which often forces static screenshots and weak sharing of code and data.
- Scientific communication systems have not kept pace with the shift to terabyte-scale datasets and complex processing pipelines.
- In some fields (e.g., large imaging datasets), scientists often publish screenshots rather than integrated, zoomable views that can be interrogated from within the narrative, which limits verification and exploration.
- Common scientific data-sharing practice is to upload an uncurated zip file to repositories (e.g., Zenodo or Dryad) without sufficient context.
Incentives, Credit, and Upstream Dissemination
- Continuous Science Foundation emphasizes community and incentive alignment (including earlier attribution and licensing) to reduce fear of being scooped and to increase credit for sharing.
- New journals such as the Journal of Open Source Software emerged to provide career credit for widely reused software labor that was historically undervalued in traditional publication incentives.
- Some journals and societies show resistance to beyond-PDF change because PDFs and current workflows are sufficient for their existing business models, reducing their incentive to invest in new approaches.
- Policy and funding shifts from major foundations are pushing research dissemination upstream toward preprint repositories.
Reproducibility as Integrity Plus Reuse
- Improving scientific reproducibility requires both (a) integrity (auditable datasets and pipelines) and (b) reuse (others can access and apply the methods and data).
- Integrating data, code, and visuals into a single computational narrative (e.g., notebook-style tools) is a key lever for improving reuse and comprehension of scientific results.
Cloud-Native Data and Compute-to-Data
- Scientific data formats are evolving from HDF5 toward Zarr-based, cloud-optimized formats designed for object storage, with self-describing metadata and more efficient partial access.
- Source Cooperative is presented as an example storage approach: built on AWS S3 buckets with a minimal data model, it supports bringing compute directly to datasets better than many archival systems do.
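The HDF5-to-Zarr shift above can be sketched in plain Python. A Zarr-style layout splits an array into independently addressable chunk objects plus a small JSON metadata document, which maps naturally onto object storage; this is a toy sketch of the Zarr v2 layout under stated assumptions, not the real `zarr` library:

```python
import json
import numpy as np

def zarr_like_store(arr: np.ndarray, chunks: tuple[int, int]) -> dict[str, bytes]:
    """Toy sketch of a Zarr-v2-style layout: one JSON metadata object
    (".zarray") plus one object per chunk, keyed "row.col" so that each
    chunk can be fetched independently from object storage (e.g. an S3
    bucket) without downloading the whole file."""
    store = {
        ".zarray": json.dumps({
            "shape": list(arr.shape),
            "chunks": list(chunks),
            "dtype": arr.dtype.str,
        }).encode(),
    }
    for i in range(0, arr.shape[0], chunks[0]):
        for j in range(0, arr.shape[1], chunks[1]):
            key = f"{i // chunks[0]}.{j // chunks[1]}"
            block = arr[i:i + chunks[0], j:j + chunks[1]]
            store[key] = np.ascontiguousarray(block).tobytes()
    return store

# A reader needing only one region fetches one small object, not the file.
data = np.arange(16, dtype="f4").reshape(4, 4)
store = zarr_like_store(data, (2, 2))
chunk = np.frombuffer(store["1.0"], dtype="f4").reshape(2, 2)  # rows 2-3, cols 0-1
```

This per-chunk addressing is what lets compute run next to the data: a worker reads only the chunk keys it needs, where an HDF5 file typically has to be opened as a whole (or accessed through byte-range tricks).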
Watchlist
- Rowan Cockett and Tracy Teal are attempting to rally stakeholders around an Open Exchange Architecture standard intended to be as widely adopted in science as the PDF while supporting modular, computational publishing with graceful degradation.
Unknowns
- What measurable impact do executable/interactive articles have on reuse outcomes (e.g., time-to-first-successful-run, verification time, downstream reuse rates) compared with traditional PDF-plus-repository workflows?
- How prevalent are the described failure modes (screenshots for rich datasets, uncurated zip uploads) across fields, and which disciplines experience the largest bottlenecks?
- Which specific publisher workflows and constraints (submission formats, archival requirements, compliance needs) most strongly enforce PDF-centric outputs today?
- What are Curvenote’s real-world integration patterns with external storage and repositories, and what operational limits result (latency, access control, identity, cost allocation)?
- What is the current state of the Open Exchange Architecture effort (spec maturity, governance, reference implementations, and adopters), and what interoperability problems does it concretely solve?