- Toronto AI Safety Meetup
- Understanding and Mitigating Hallucinations in LLMs
- Types
- Intrinsic
- output conflicts with prompt
- Factual inconsistency
- Factual fabrication
- saying things that have no basis in reality
- Extrinsic
- output can't be verified from source
- Instruction inconsistency
- Not following instructions
- Context inconsistency
- Output uses facts from the input, but misstates them
- Logical inconsistency
- things like doing math wrong
- Problems
- Imitative falsehoods
- e.g.
- making the same mistakes humans do (e.g. saying Thomas Edison invented the lightbulb), because lots of that is in the training data
- repeating wrong information from RAG context
- Duplication bias
- e.g.
- repeating things directly from the prompt
- Social bias
- e.g.
- Assuming "Dr. Kim" is from South Korea
- Co-occurrence bias
- e.g.
- "Toronto is the capital of Canada"
- "Toronto" and "capital" often appear together in the training data
- Long-tail knowledge
- not having enough pre-training data about obscure people to write about them accurately
- Complex reasoning
- Auto-regressive limitations
- e.g.
- arithmetic
- learned by example (not rules)
- harder for numbers with many digits
- need to know the carry from every lower digit before generating the first (most significant) digit (see the sketch below)
- numbers tokenized in an unhelpful way
- Claude puts digits in groups of 3 to improve arithmetic handling
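
To make the carry point concrete, here is a small Python sketch (my own illustration, not from the talk): to emit the most significant digit of a sum first, as an autoregressive model must, every carry from the lower digits already has to be resolved. The `group_digits` helper likewise only illustrates the claimed 3-digit grouping; it is not Claude's actual tokenizer.

```python
def digits_msb_first(a: int, b: int) -> list[int]:
    """Digits of a + b, most-significant-first (the order an LM generates in).
    The carry chain runs from the LEAST significant digit upward, so the whole
    addition must effectively be done before the first digit can be emitted."""
    a_rev = [int(d) for d in str(a)][::-1]  # least significant digit first
    b_rev = [int(d) for d in str(b)][::-1]
    out, carry = [], 0
    for i in range(max(len(a_rev), len(b_rev))):
        da = a_rev[i] if i < len(a_rev) else 0
        db = b_rev[i] if i < len(b_rev) else 0
        carry, digit = divmod(da + db + carry, 10)
        out.append(digit)
    if carry:
        out.append(carry)
    return out[::-1]


def group_digits(n: int, group: int = 3) -> list[str]:
    """Illustration of tokenizing a number in right-aligned groups of 3 digits
    ('1234567' -> ['1', '234', '567']) instead of arbitrary BPE chunks."""
    s = str(n)
    head = len(s) % group
    return ([s[:head]] if head else []) + [s[i:i + group] for i in range(head, len(s), group)]


assert digits_msb_first(999, 2) == [1, 0, 0, 1]   # one carry changes every digit of the result
assert group_digits(1234567) == ["1", "234", "567"]
```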
- Inference-related issues
- temperature (see the sampling sketch below)
- Gemini compressed context
- Causes problems with in-context learning
- e.g. the Evy programming language
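
For reference, a minimal sketch of standard temperature sampling (generic formulation, not tied to any particular model): logits are divided by the temperature before the softmax, so higher temperatures flatten the distribution and make unlikely, possibly hallucinated continuations more likely to be drawn.

```python
import numpy as np

def sample_with_temperature(logits: np.ndarray, temperature: float,
                            rng: np.random.Generator | None = None) -> int:
    """Divide logits by the temperature, then sample from the softmax.
    T -> 0 approaches greedy decoding; large T flattens the distribution,
    raising the chance of sampling a low-probability token."""
    rng = rng or np.random.default_rng()
    scaled = logits / max(temperature, 1e-8)
    probs = np.exp(scaled - scaled.max())        # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

# Example: the same logits behave very differently at low vs high temperature.
logits = np.array([3.0, 1.0, 0.5, 0.1])
print(sample_with_temperature(logits, 0.2))   # almost always index 0
print(sample_with_temperature(logits, 1.5))   # other indices appear often
```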
- Compressed context
- Instead of the n^2 attention matrix over the full context
- When processing the start of the context, it builds a compressed hidden state; it acts like an RNN, with tokens pushed into it
- Has a smaller short-term context
- Good for Q&A over long contexts, worse for exact pattern matching (see the sketch below)
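
A toy sketch of the idea as described above (my illustration, not Gemini's actual architecture): tokens that fall out of a short full-attention window are folded into a fixed-size hidden state, RNN-style, so the long-range gist survives but exact token patterns outside the window are lost.

```python
import numpy as np

class CompressedContext:
    """Toy compressed-context memory: a short window of recent token embeddings
    kept verbatim, plus a fixed-size summary that older tokens are folded into,
    RNN-style. The summary preserves gist (useful for long-context Q&A) but not
    exact token order (bad for in-context pattern matching)."""

    def __init__(self, d_model: int, window: int, decay: float = 0.9):
        self.summary = np.zeros(d_model)        # compressed hidden state
        self.window = window                    # short-term span kept verbatim
        self.decay = decay
        self.recent: list[np.ndarray] = []

    def push(self, token_embedding: np.ndarray) -> None:
        self.recent.append(token_embedding)
        if len(self.recent) > self.window:
            evicted = self.recent.pop(0)
            # Fold the evicted token into the summary instead of keeping
            # an n x n attention matrix over the full prefix.
            self.summary = self.decay * self.summary + (1 - self.decay) * evicted

    def visible_context(self) -> tuple[np.ndarray, list[np.ndarray]]:
        """What attention can see: the summary plus the recent window."""
        return self.summary, self.recent
```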
- Softmax bottlenecks
- https://aclanthology.org/2022.acl-long.554.pdf
- Introduces the problem
- Describes multi-facet softmax, which fixes it
- The model can only output one next-token distribution, so if there are two plausible continuations, the single hidden embedding that falls between them can give the wrong token the highest probability (see the toy example below)
- (relevant to clump)
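
A constructed toy example of the bottleneck (mine, not taken from the paper): with a single hidden vector h, the output distribution is softmax(E @ h), so a hidden state that tries to cover two distant word embeddings can hand the highest probability to an unrelated token that lies between them. The paper's multi-facet softmax addresses this by using multiple output facets rather than one vector.

```python
import numpy as np

# Output embeddings for a 3-word vocabulary, constructed so that token C
# sits "between" the two genuinely plausible continuations A and B.
E = np.array([
    [1.0,  1.0],   # token A (plausible)
    [1.0, -1.0],   # token B (plausible)
    [1.4,  0.0],   # token C (unrelated)
])

h = E[0] + E[1]                       # hidden state trying to cover both A and B -> [2, 0]
logits = E @ h                        # A: 2.0, B: 2.0, C: 2.8
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(dict(zip("ABC", probs.round(3))))   # token C wins, even though only A and B are plausible
```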
- Uncertainty measurement
- Can we regenerate when the model is unsure? (see the sketch below)
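
One way the regenerate-when-unsure idea could look in code (a sketch under assumptions: `generate` is a hypothetical function returning the generated text plus the probability of each sampled token, and the threshold value is illustrative): measure the model's average surprisal over its own output and resample when it is too high.

```python
import math
from typing import Callable

def generate_unless_unsure(
    generate: Callable[[str], tuple[str, list[float]]],  # hypothetical: (text, per-token probs)
    prompt: str,
    max_mean_surprisal: float = 2.5,   # nats per token; illustrative threshold
    max_tries: int = 3,
) -> str:
    """Regenerate when the model looks unsure of its own output.
    Uncertainty is measured as mean surprisal (-log p) of the sampled tokens;
    high values often correlate with hallucinated content."""
    best_text, best_score = "", float("inf")
    for _ in range(max_tries):
        text, token_probs = generate(prompt)
        mean_surprisal = sum(-math.log(p) for p in token_probs) / len(token_probs)
        if mean_surprisal < best_score:
            best_text, best_score = text, mean_surprisal
        if mean_surprisal <= max_mean_surprisal:
            return text                 # confident enough; stop here
    return best_text                    # otherwise return the least-uncertain attempt
```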