- Toronto AI Safety Meetup
- Understanding and Mitigating Hallucinations in LLMs
- Types
- Intrinsic
- output conflicts with prompt
- Factual inconsistency
- Factual fabrication
- saying things that have no basis in reality
- Extrinsic
- output can't be verified from source
- Instruction inconsistency
- Not following instructions
- Context inconsistency
- Output uses facts from the input, but misstates them
- Logical inconsistency
- things like doing math wrong
- Problems
- Imitative falsehoods
- e.g.
- making the same mistakes humans do (e.g. saying Thomas Edison invented the lightbulb), because lots of that is in the training data
- repeating wrong information from RAG context
- Duplication bias
- e.g.
- repeating things directly from the prompt
- Social bias
- e.g.
- Assuming "Dr. Kim" is from South Korea
- Co-occurrence bias
- e.g.
- "Toronto is the capital of Canada"
- "Toronto" and "capital" often appear together in the training data
- Long-tail knowledge
- not having enough pre-training data about obscure people to write about them accurately
- Complex reasoning
- Auto-regressive limitations
- e.g.
- arithmetic
- learned by example (not rules)
- harder for numbers with many digits
- need to know the carry from every lower digit before generating the first (most significant) digit (see the sketch below)
- numbers tokenized in an unhelpful way
- Claude puts digits in groups of 3 to improve arithmetic handling
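
To make the carry point concrete, here is a small Python sketch (my own illustration, not from the talk): to emit the most significant digit of a sum first, as an autoregressive model must, every carry from the lower digits already has to be resolved. The `group_digits` helper likewise only illustrates the claimed 3-digit grouping; it is not Claude's actual tokenizer.

```python
def digits_msb_first(a: int, b: int) -> list[int]:
    """Digits of a + b, most-significant-first (the order an LM generates in).
    The carry chain runs from the LEAST significant digit upward, so the whole
    addition must effectively be done before the first digit can be emitted."""
    a_rev = [int(d) for d in str(a)][::-1]  # least significant digit first
    b_rev = [int(d) for d in str(b)][::-1]
    out, carry = [], 0
    for i in range(max(len(a_rev), len(b_rev))):
        da = a_rev[i] if i < len(a_rev) else 0
        db = b_rev[i] if i < len(b_rev) else 0
        carry, digit = divmod(da + db + carry, 10)
        out.append(digit)
    if carry:
        out.append(carry)
    return out[::-1]


def group_digits(n: int, group: int = 3) -> list[str]:
    """Illustration of tokenizing a number in right-aligned groups of 3 digits
    ('1234567' -> ['1', '234', '567']) instead of arbitrary BPE chunks."""
    s = str(n)
    head = len(s) % group
    return ([s[:head]] if head else []) + [s[i:i + group] for i in range(head, len(s), group)]


assert digits_msb_first(999, 2) == [1, 0, 0, 1]   # one carry changes every digit of the result
assert group_digits(1234567) == ["1", "234", "567"]
```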
- Inference-related issues
- temperature (see the sampling sketch below)
- Gemini compressed context
- Causes problems with in-context learning
- e.g. the Evy programming language
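
For reference, a minimal sketch of standard temperature sampling (generic formulation, not tied to any particular model): logits are divided by the temperature before the softmax, so higher temperatures flatten the distribution and make unlikely, possibly hallucinated continuations more likely to be drawn.

```python
import numpy as np

def sample_with_temperature(logits: np.ndarray, temperature: float,
                            rng: np.random.Generator | None = None) -> int:
    """Divide logits by the temperature, then sample from the softmax.
    T -> 0 approaches greedy decoding; large T flattens the distribution,
    raising the chance of sampling a low-probability token."""
    rng = rng or np.random.default_rng()
    scaled = logits / max(temperature, 1e-8)
    probs = np.exp(scaled - scaled.max())        # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(logits), p=probs))

# Example: the same logits behave very differently at low vs high temperature.
logits = np.array([3.0, 1.0, 0.5, 0.1])
print(sample_with_temperature(logits, 0.2))   # almost always index 0
print(sample_with_temperature(logits, 1.5))   # other indices appear often
```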
- Compressed context
- Instead of the n^2 attention matrix over the full context
- When processing the start of the context, it builds a compressed hidden state; it acts like an RNN, with tokens pushed into it
- Has a smaller short-term context
- Good for Q&A over long contexts, worse for exact pattern matching (see the sketch below)
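
A toy sketch of the idea as described above (my illustration, not Gemini's actual architecture): tokens that fall out of a short full-attention window are folded into a fixed-size hidden state, RNN-style, so the long-range gist survives but exact token patterns outside the window are lost.

```python
import numpy as np

class CompressedContext:
    """Toy compressed-context memory: a short window of recent token embeddings
    kept verbatim, plus a fixed-size summary that older tokens are folded into,
    RNN-style. The summary preserves gist (useful for long-context Q&A) but not
    exact token order (bad for in-context pattern matching)."""

    def __init__(self, d_model: int, window: int, decay: float = 0.9):
        self.summary = np.zeros(d_model)        # compressed hidden state
        self.window = window                    # short-term span kept verbatim
        self.decay = decay
        self.recent: list[np.ndarray] = []

    def push(self, token_embedding: np.ndarray) -> None:
        self.recent.append(token_embedding)
        if len(self.recent) > self.window:
            evicted = self.recent.pop(0)
            # Fold the evicted token into the summary instead of keeping
            # an n x n attention matrix over the full prefix.
            self.summary = self.decay * self.summary + (1 - self.decay) * evicted

    def visible_context(self) -> tuple[np.ndarray, list[np.ndarray]]:
        """What attention can see: the summary plus the recent window."""
        return self.summary, self.recent
```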
- Softmax bottlenecks
- https://aclanthology.org/2022.acl-long.554.pdf
- Introduces the problem
- Describes multi-facet softmax, which fixes it
- The model can only output one next-token distribution, so if there are two plausible continuations, the single hidden embedding that falls between them can give the wrong token the highest probability (see the toy example below)
- (relevant to clump)
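
A constructed toy example of the bottleneck (mine, not taken from the paper): with a single hidden vector h, the output distribution is softmax(E @ h), so a hidden state that tries to cover two distant word embeddings can hand the highest probability to an unrelated token that lies between them. The paper's multi-facet softmax addresses this by using multiple output facets rather than one vector.

```python
import numpy as np

# Output embeddings for a 3-word vocabulary, constructed so that token C
# sits "between" the two genuinely plausible continuations A and B.
E = np.array([
    [1.0,  1.0],   # token A (plausible)
    [1.0, -1.0],   # token B (plausible)
    [1.4,  0.0],   # token C (unrelated)
])

h = E[0] + E[1]                       # hidden state trying to cover both A and B -> [2, 0]
logits = E @ h                        # A: 2.0, B: 2.0, C: 2.8
probs = np.exp(logits - logits.max())
probs /= probs.sum()
print(dict(zip("ABC", probs.round(3))))   # token C wins, even though only A and B are plausible
```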
- Uncertainty measurement
- Can we regenerate when the model is unsure? (see the sketch below)
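
One way the regenerate-when-unsure idea could look in code (a sketch under assumptions: `generate` is a hypothetical function returning the generated text plus the probability of each sampled token, and the threshold value is illustrative): measure the model's average surprisal over its own output and resample when it is too high.

```python
import math
from typing import Callable

def generate_unless_unsure(
    generate: Callable[[str], tuple[str, list[float]]],  # hypothetical: (text, per-token probs)
    prompt: str,
    max_mean_surprisal: float = 2.5,   # nats per token; illustrative threshold
    max_tries: int = 3,
) -> str:
    """Regenerate when the model looks unsure of its own output.
    Uncertainty is measured as mean surprisal (-log p) of the sampled tokens;
    high values often correlate with hallucinated content."""
    best_text, best_score = "", float("inf")
    for _ in range(max_tries):
        text, token_probs = generate(prompt)
        mean_surprisal = sum(-math.log(p) for p in token_probs) / len(token_probs)
        if mean_surprisal < best_score:
            best_text, best_score = text, mean_surprisal
        if mean_surprisal <= max_mean_surprisal:
            return text                 # confident enough; stop here
    return best_text                    # otherwise return the least-uncertain attempt
```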