Dnext

September 29, 2024 2:50am

llm-interpolate interpolates between embeddings. As an example of what that means, if you ask it to interpolate between "MyRapSong.wav" and "MyContrySong.wav" (misspelled?) with five intermediate points, it gives you:

"MyRapSong.wav",
"HipHopMeetsCountry.wav",
"SmoothCountryRap.wav",
"CountryVibes.wav",
"MyCountrySong.wav"

It's just interpolating filenames? Not generating the actual songs?

Ok, weird idea.

If you try this, let me know how it goes. It looks like you have to install an llm embed tool for it to work.
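
For what it's worth, here is a rough guess at what "interpolating between embeddings" could mean in a case like this. It is a sketch under my own assumptions, not taken from the llm-interpolate code: the 2-D vectors are made up for illustration (real embeddings have hundreds of dimensions), and `lerp`, `interpolate`, and `nearest` are hypothetical helper names. The idea is to walk in a straight line from one embedding to the other and, at each step, look up the stored item whose embedding is closest.

```python
import math

def lerp(a, b, t):
    """Linear interpolation between two equal-length vectors at fraction t in [0, 1]."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

def interpolate(a, b, n_points):
    """Return n_points vectors evenly spaced from a to b, endpoints included."""
    if n_points < 2:
        raise ValueError("need at least the two endpoints")
    return [lerp(a, b, i / (n_points - 1)) for i in range(n_points)]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

def nearest(query, collection):
    """Name of the stored item whose vector is most similar to query."""
    return max(collection, key=lambda name: cosine(query, collection[name]))

# Made-up 2-D "embeddings" (axis 0 = rap-ness, axis 1 = country-ness).
# These numbers are assumptions for illustration, not real model output.
collection = {
    "MyRapSong.wav": [1.0, 0.0],
    "HipHopMeetsCountry.wav": [0.8, 0.3],
    "SmoothCountryRap.wav": [0.5, 0.5],
    "CountryVibes.wav": [0.2, 0.9],
    "MyCountrySong.wav": [0.0, 1.0],
}

path = interpolate(collection["MyRapSong.wav"], collection["MyCountrySong.wav"], 5)
names = [nearest(point, collection) for point in path]
print(names)
```

Under this reading nothing is generated: the tool would only ever return items that are already stored in the collection, which would explain why the output is a list of filenames rather than new songs.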

#solidstatelife #ai #genai #llms #embeddings

https://github.com/vagos/llm-interpolate

GitHub - vagos/llm-interpolate: Interpolate between embedding points with llm


ohdeifepha

February 8, 2024 9:30am

Code Representation Learning At Scale | #dev #embeddings #llms

Code Representation Learning At Scale

Recent studies have shown that code language models at scale demonstrate significant performance gains on downstream tasks, i.e., code generation. However, most of the existing works on code representation learning train models at a hundred...

ohdeifepha

February 5, 2024 7:30am

StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis | #vector #graphics #tokens #embeddings #strokenuwa

StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis

To leverage LLMs for visual synthesis, traditional methods convert raster image information into discrete grid tokens through specialized visual modules, while disrupting the model's ability to capture the true semantic representation of visual...

Wayne Radinsky

February 22, 2023 4:37am

"Vector search for the uninitiated". Explaining embeddings and vector search with animals, law enforcement, and cuddliness.

Vector search for the uninitiated

#solidstatelife #ai #nlp #embeddings

Vector Search for the Uninitiated

What is vector search and why all of a sudden are we talking about it?

Wayne Radinsky

February 4, 2022 4:08am

OpenAI's latest state-of-the-art models for dense text embeddings are vastly larger and more expensive than previous models, but no better and sometimes worse, according to Nils Reimers, an AI researcher at Hugging Face. First I should say a bit about what "dense embeddings" are. "Embeddings" are vectors that capture something of the semantic meaning of words, such that vectors close together represent words with similar meanings, and relationships between vectors correlate with relationships between words. Don't worry if calling this an "embedding" makes no sense. Ok, what about the "dense" part? Embeddings can be "sparse" or "dense", where "sparse" means you have thousands of dimensions but most are 0, and "dense" means you have fewer dimensions (say, 400), but most elements are non-zero. Most of the embeddings that you're familiar with are the dense kind: Word2Vec, fastText, GloVe, etc.
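
To make the sparse/dense distinction concrete, here is a toy sketch. Everything in it is an assumption for illustration: the ten-word vocabulary, the hand-picked 3-D "dense" vectors, and the helper names `bag_of_words`, `nonzero_fraction`, and `cosine` are all made up, and no real embedding model is involved. The sparse vector is a plain word count over the vocabulary, so most entries are 0; the dense vectors have few dimensions, all non-zero, and similar words sit close together.

```python
import math

def bag_of_words(text, vocab):
    """Sparse-style vector: one dimension per vocabulary word, holding its count."""
    words = text.lower().split()
    return [words.count(w) for w in vocab]

def nonzero_fraction(vec):
    """Fraction of elements that are non-zero."""
    return sum(1 for x in vec if x != 0) / len(vec)

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

# Tiny made-up vocabulary; a real one would have thousands of entries.
vocab = ["cat", "dog", "kitten", "puppy", "car", "truck", "tax", "law", "police", "song"]
sparse = bag_of_words("the cat chased the dog", vocab)

# Hand-picked dense vectors (assumed values, not output of Word2Vec or any model).
dense = {
    "cat": [0.90, 0.80, 0.10],
    "kitten": [0.85, 0.90, 0.05],
    "truck": [0.10, 0.05, 0.95],
}

print(sparse, nonzero_fraction(sparse))
print(nonzero_fraction(dense["cat"]))
print(cosine(dense["cat"], dense["kitten"]), cosine(dense["cat"], dense["truck"]))
```

With these toy numbers, "cat" and "kitten" come out far more similar to each other than "cat" and "truck", which is the "vectors close together represent words with similar meanings" property described above.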

In his summary he says, "The OpenAI text similarity models perform poorly and much worse than the state of the art."

"The text search models perform quite well, giving good results on several benchmarks. But they are not quite state-of-the-art compared to recent, freely available models."

"The embedding models are slow and expensive: Encoding 10 million documents with the smallest OpenAI model will cost about $80,000. In comparison, using an equally strong open model and running it on cloud will cost as little as $1. Also, operating costs are tremendous: Using the OpenAI models for an application with 1 million monthly queries costs up to $9,000 / month. Open models, which perform better at much lower latencies, cost just $300 / month for the same use-case."

"They generate extremely high-dimensional embeddings, significantly slowing down downstream applications while requiring much more memory."

Usually newer is better and bigger is better, but not always.

OpenAI GPT-3 Text Embeddings -- Really a new state-of-the-art in dense text embeddings?

#solidstatelife #ai #embeddings #nlp #openai

OpenAI GPT-3 Text Embeddings - Really a new state-of-the-art in dense text embeddings?

This week, OpenAI announced an embeddings endpoint (paper) for GPT-3 that allows users to derive dense text embeddings for a given input…
