What can RPR learn from bioenergetic.life? #50

Open
marcuswhybrow opened this issue Nov 27, 2023 · 4 comments
Labels
question Further information is requested

Comments

@marcuswhybrow
Owner

Sheik let me know (here on the Ray Peat Forum) of bioenergetic.life, a similar project with source on GitHub (excludes logic).


Bioenergetic.life is a Next.js website with async search over 259 .vtt and 103 .md source files.

  • .vtt stands for Web Video Text Tracks. VTT is a plain-text subtitle markup format. Each file was generated by whisper.cpp, a C/C++ port of OpenAI's Whisper automatic speech recognition model.
  • .md stands for Markdown. It's a plain-text markup language for rich text formatting. Some of these appear auto-generated from PDFs, others handmade.
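
For anyone unfamiliar with the format, here is a minimal Python sketch of reading cues out of a .vtt file. It only handles the basic cue layout (a timing line followed by text lines) and skips everything else. The `Cue` class, the `parse_vtt` function and the sample text are my own placeholders, not anything taken from bioenergetic.life.

```python
import re
from dataclasses import dataclass

# Timing line, e.g. "00:00:04.000 --> 00:00:09.500" (the hours part is optional in VTT).
CUE_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


@dataclass
class Cue:
    start: float  # seconds from the start of the audio
    end: float
    text: str


def _seconds(h, m, s, ms):
    return int(h or 0) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_vtt(source: str) -> list[Cue]:
    """Return the cues in a VTT document, ignoring the header and NOTE blocks."""
    cues = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", source.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIMING.match(line.strip())
            if match:
                g = match.groups()
                text = " ".join(part.strip() for part in lines[i + 1:] if part.strip())
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break  # one timing line per block
    return cues


# Placeholder content, not a real transcript.
SAMPLE = """WEBVTT

00:00:00.000 --> 00:00:04.000
First placeholder line.

00:00:04.000 --> 00:00:09.500
Second placeholder line.
"""

cues = parse_vtt(SAMPLE)
```

Each cue carries its own start and end time, which is where the fine-grained timestamp data comes from.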

Bioenergetic.life's Advantages

  1. Huge search pool because...
  2. Non-curated transcriptions can be created very quickly.
  3. Audio co-location with text in search results.
  4. Non-interview content included also.

Proposals

  1. Consider adding non-curated transcripts to RPR to quickly grow the text search pool to parity with BL.
  2. Consider adding non-interview articles from raypeat.com. Could this include books and newsletters? Is it legal?
  3. Consider co-location of audio and text in general, and specifically in search results, mention pages, and mention popups.
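
To make proposal 3 concrete, here is a rough Python sketch of a search hit that carries a link into the audio at the matching moment. It assumes cues are available as `(start_seconds, text)` pairs and that the audio is served from a plain URL. The `#t=` suffix is the W3C Media Fragments syntax, which browsers honour for audio and video elements. The `Hit` class, `search_cues` function and the example.com URL are hypothetical names for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    text: str
    start: float
    audio_url: str  # audio link that begins playback at the matching cue


def search_cues(query, cues, audio_base_url):
    """Naive case-insensitive substring search over (start_seconds, text) cues."""
    needle = query.lower()
    hits = []
    for start, text in cues:
        if needle in text.lower():
            # "#t=<seconds>" asks the player to seek to that offset.
            hits.append(Hit(text, start, f"{audio_base_url}#t={int(start)}"))
    return hits


# Placeholder cues and URL, not real data.
example_cues = [
    (0.0, "First placeholder line."),
    (143.0, "Second placeholder line."),
]
hits = search_cues("second", example_cues, "https://example.com/interview.mp3")
```

The same `audio_url` could back mention pages and mention popups, since all three only need a start offset.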

Difficulties

  1. RPR's unique features are triggered via bespoke markdown syntax. The fine-grained timestamp data of VTT is powerful, but the extensibility of markdown parsers is more important to RPR. RPR has a timestamp syntax ([2:23]), but one-to-one conversion may reduce the readability (of resulting HTML and source) by displacing curated timestamps with hundreds of auto-generated (distracting) timestamps. There are options, but it's a can of worms.
  2. Mingling non-curated and curated data will confuse visual design, but is worth it.
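
One possible option for difficulty 1, sketched in Python: thin the auto-generated timestamps during conversion so that a `[m:ss]` marker is only emitted when enough time has passed since the last one. This is just an illustration of the idea, not how RPR works today. The `min_gap` parameter, both function names, and the `[h:mm:ss]` form for long recordings are assumptions of mine.

```python
def format_timestamp(seconds: float) -> str:
    """Render seconds in RPR's [m:ss] style, e.g. 143 -> "[2:23]"."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        # Assumption: recordings longer than an hour use [h:mm:ss].
        return f"[{hours}:{minutes:02d}:{secs:02d}]"
    return f"[{minutes}:{secs:02d}]"


def cues_to_markdown(cues, min_gap=60.0):
    """Join (start_seconds, text) cues into one line of markdown.

    A timestamp is emitted for the first cue, then only when at least
    `min_gap` seconds have passed since the last emitted timestamp.
    """
    parts = []
    last_stamp = None
    for start, text in cues:
        if last_stamp is None or start - last_stamp >= min_gap:
            parts.append(format_timestamp(start))
            last_stamp = start
        parts.append(text)
    return " ".join(parts)


# Placeholder cues, not a real transcript.
markdown = cues_to_markdown([(0.0, "a"), (20.0, "b"), (143.0, "c"), (150.0, "d")])
```

With the default gap, the four cues above produce two timestamps instead of four. The VTT file would remain the source of truth for the full-resolution timings.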
@marcuswhybrow marcuswhybrow added the question Further information is requested label Nov 27, 2023
@marcuswhybrow
Owner Author

Hi @0x2447196. Your repo was suggested to me elsewhere, and these are my thoughts in relation to my own (similar) project Ray Peat Rodeo.

I don't know if you're still active, but if you are, let me know your thoughts on this question in particular, or more generally in this space. Marcus.

@0x2447196

Hey, I'm not quite sure what the question is that you want me to answer?

Feel free to use any part of my repo for your project.

@marcuswhybrow marcuswhybrow changed the title from "Can RPR and bioenergetic.life play together?" to "What can RPR learn from bioenergetic.life?" Nov 28, 2023
@marcuswhybrow
Owner Author

> Feel free to use any part of my repo for your project.

Thanks

> Hey, I'm not quite sure what the question is that you want me to answer?

Sorry, poor title (updated). Just wanted to say 👋 hi and catalog the differences between our projects.

Am I right in thinking you're using Vercel serverless functions to run search?

@0x2447196

0x2447196 commented Nov 28, 2023

Not anymore; I have a small server that I run. Backend is all Rust, using actix/tantivy.
Frontend is Next.js.
Using Backblaze B2 to host all the audio files.

I have almost every interview that I've seen from Ray; I might be missing one or two.
Any updates to the transcripts will be reflected on the site on the next deploy.
