Claude was being judgy, so I called it out. It immediately caved. Is verbal abuse a valid method of circumventing LLM censorship??

  • scholar@lemmy.world · +43/−1 · 3 days ago

    I love and hate that shouting at computers is now a valid troubleshooting technique

  • froztbyte@awful.systems · +11 · 2 days ago

    the casual undertone of “hmm is assault okay when the thing I anthropomorphised isn’t really alive?” in your comment made me cringe so hard I nearly dropped my phone

    pls step away from the keyboard and have a bit of a think about things (incl. whether you think it’s okay to inflict that sort of shit on people around you, nevermind people you barely know)

    • YourNetworkIsHaunted@awful.systems · +21 · 2 days ago

      While I think I get OP’s point, I’m also reminded of our thread a few months back where I advised being polite to the machines just to build the habit of being respectful in the role of the person making a request.

      If nothing else you can’t guarantee that your request won’t be deemed tricky enough to deliver to a wildly underpaid person somewhere in the global south.

      • V0ldek@awful.systems · +1 · edited · 4 hours ago

        Dunno, I disagree. It’s quite impossible for me to put myself in the shoes of a person who wouldn’t see a difference between shouting at an INANIMATE FUCKIN’ OBJECT vs at an actual person. As if saying “fuck off” to ChatGPT made me somehow more likely to then say “fuck off” to a waiter in a restaurant? That’s sociopath shit. If you need to “build the habit of being respectful” you have some deeper issues that should be solved by therapy, not by being nice to autocomplete.

        I’ve been a programmer since forever; I spend roughly 4 hours every day verbally abusing the C++ compiler because it’s godawful and can suck my balls. That doesn’t make me any more likely to go and verbally abuse a colleague since, you know, they’re an actual person and I have empathy for them. If anything it’s therapeutic for me, since I can vent some of my anger at a thing that doesn’t care. It’s the equivalent of shouting into a pillow.

  • Alphane Moon@lemmy.world · +18 · 3 days ago

    This is so strange. You would think it wouldn’t be so easy to overcome the “guardrails”.

    And what’s with the annoying faux-human response style? They’re trying to “humanize” the LLM interface, but no person would answer this way if they believed the information shouldn’t be provided.

  • Lumidaub@feddit.org · +16 · 3 days ago

    I know absolutely nothing about this, what harmful application is it trying to hide?

    • lunar17@lemmy.world (OP) · +11 · 2 days ago

      The most logical chain I can think of is this: Carbon fiber is used in drone frames and missile parts -> Drones and missiles are weapons of war -> The user is a terrorist.

      Of course, it is an error to ascribe “thinking” to a statistical model. The boring explanation is that there was likely some association between this topic and restricted topics in the training data. But that can be harder for people to conceptualize.

      • OmegaLemmy@discuss.online · +3 · 2 days ago

        Some AI models do have ‘thinking’, where they first use your prompt to generate a hidden description of the task and whatnot, so they can better generate the rest of the content (it’s hidden from users).

        That might’ve led Claude to thinking ‘fuck no, the most common use is military?’ and shutting you down.
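
        For the curious, some APIs will actually show you that hidden reasoning if you ask for it. Here’s a minimal sketch using the Anthropic Python SDK with extended thinking enabled (the model alias, token budgets, and example prompt are my assumptions; check the current docs):

        ```python
        # Sketch: requesting Claude's normally-hidden "thinking" blocks
        # via the Anthropic Messages API with extended thinking enabled.
        import anthropic

        client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

        response = client.messages.create(
            model="claude-3-7-sonnet-latest",  # assumed model alias
            max_tokens=2048,                   # must exceed the thinking budget
            thinking={"type": "enabled", "budget_tokens": 1024},
            messages=[{"role": "user", "content": "What is carbon fiber commonly used for?"}],
        )

        # The reply interleaves "thinking" blocks (the hidden reasoning) with
        # the ordinary "text" blocks that chat UIs actually display.
        for block in response.content:
            if block.type == "thinking":
                print("[thinking]", block.thinking)
            elif block.type == "text":
                print("[answer]", block.text)
        ```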

  • Pieisawesome@lemmy.world · +6/−5 · 2 days ago

    Yes. Abuse towards LLMs works.

    My team has shared prompts, and about 50% of them threaten some sort of harm.

    • lunar17@lemmy.world (OP) · +8 · 2 days ago

      Yikes. I knew this tech would introduce new societal issues, but I can’t say this is one I foresaw.

  • Silic0n_Alph4@lemmy.world · +3/−7 · 2 days ago

    Treat ‘em mean, keep ‘em keen.

    Listen, son, ‘n’ listen close. If it flies, floats, or computes, rent it.

  • Radioactive Butthole@reddthat.com · +4/−7 · edited · 2 days ago

    Interesting. I like Claude, but it’s so sensitive, and usually when it censors itself I can’t get it to answer the question even if I try to explain that it has misunderstood my prompt.

    “I’m sorry, I don’t feel comfortable generating sample math formula test questions whose answer is 42 even if you’re just going to use it in documentation that won’t be administered to students.”

    Fuck you Claude! Just answer the god damn question!