As I mentioned in my previous LLM+k8s post, I've been curious about how we can bring the power of LLMs to bear on generating k8s resources from their own knowledge, given some natural-language instruction. I've been a big fan of JavaScript from waaaay back, and more recently I've become a big fan of the jsPolicy engine and of the Voyager paper, which used Mineflayer, the JS-based Minecraft API.

So can we similarly leverage an LLM to play the game of managing policy in a k8s cluster? The basic UX I'm looking for here is that the user states a desire in human language, like "Deny any IAMPolicy config connector resources that references a Project" (like I defined in my previous post), and the controller uses GPT-4 to generate the jspolicy implementing that intent.

And this is an example of what it comes up with in response to that intent:

apiVersion: policy.jspolicy.com/v1beta1
kind: JsPolicy
metadata:
  creationTimestamp: '2024-03-19T21:16:24Z'
  generation: 1
  name: no-iampolicy-targeting-projects.drzzl.io
  ownerReferences:
    - apiVersion: gpt.drzzl.io/v1
      controller: true
      kind: GPTPolicy
      name: no-iampolicy-targeting-projects.drzzl.io
      uid: f5794403-f498-4ca8-a9a2-a02e838dd9a6
  resourceVersion: '109696173'
  uid: 385ae777-2ad8-4ed1-bbda-23c77645071d
spec:
  apiGroups:
    - configconnector.cnrm.cloud.google.com
  apiVersions:
    - '*'
  javascript: >-
    if (request.object.spec.resourceRef.kind === 'Project') { deny('IAMPolicy
    cannot target a Project'); } else { allow(); }
  operations:
    - CREATE
    - UPDATE
  resources:
    - iampolicies
  scope: Namespaced
  type: Validating

The output is actually slightly flawed (like most LLM output): the apiGroups value should be iam.cnrm.cloud.google.com, not configconnector.cnrm.cloud.google.com.
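The generated JavaScript has a second, quieter weakness: it will throw if a resource has no spec.resourceRef. A hand-corrected version of the check might look like the sketch below. It is written as a plain function so it can run outside the cluster; `checkIamPolicy` and its return shape are hypothetical stand-ins for jsPolicy's deny() and allow(), not part of the jsPolicy API.

```javascript
// Sketch only: a plain-function version of the generated rule.
// `checkIamPolicy` is a hypothetical helper; in a real policy the two
// branches would call jsPolicy's deny() and allow() instead of returning.
function checkIamPolicy(request) {
  // Optional chaining avoids a TypeError when spec or resourceRef is absent.
  const kind = request?.object?.spec?.resourceRef?.kind
  if (kind === 'Project') {
    return { allowed: false, message: 'IAMPolicy cannot target a Project' }
  }
  return { allowed: true }
}
```

A request whose resourceRef.kind is 'Project' comes back with allowed set to false, and anything else, including a resource with no resourceRef at all, is allowed.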

This is, after all, a first cut. It still has issues and has already taken my cluster down once. I've made a number of tweaks to the context I'm providing to the LLM, as it doesn't actually have much reference material about what correct jspolicy resources look like.

While I think jspolicy is an amazing tool for simply enforcing intent on your cluster in small bits of policy, mutation, and/or controller code, it's still not widely used, and there are only a couple of small repositories holding simple examples.

To that end, most of the context in my prompt is focused on shoring up the LLM's knowledge of jspolicy and telling it what information it will be given and how it should act on that information. Essentially, I force the LLM to call a function, of which I offer only one: create, which creates a k8s resource of the LLM's choosing, with a very liberal schema to guide it.
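Because the schema is so liberal, it can be worth checking what comes back before handing it to the cluster. The sketch below pulls the forced create call out of a chat completion and checks the fields the schema marks as required. `parseCreateCall` is a hypothetical helper; the response path is the same `choices[0].message.function_call` one the controller reads.

```javascript
// Sketch: extract and sanity-check the forced `create` call.
// `parseCreateCall` is a hypothetical helper, not part of jsPolicy or OpenAI's SDK.
function parseCreateCall(completion) {
  const call = completion?.choices?.[0]?.message?.function_call
  if (!call || call.name !== 'create') {
    throw new Error('model did not call create')
  }
  // The function arguments arrive as a JSON-encoded string.
  const resource = JSON.parse(call.arguments)
  if (typeof resource !== 'object' || resource === null) {
    throw new Error('create arguments are not an object')
  }
  // Mirror the `required` lists from the function schema.
  for (const key of ['apiVersion', 'kind', 'metadata', 'spec']) {
    if (!(key in resource)) throw new Error(`missing required field: ${key}`)
  }
  if (!resource.metadata?.name) throw new Error('missing metadata.name')
  return resource
}
```

This only catches structural problems. A policy with a wrong apiGroups value, like the one above, passes it without complaint.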

I don't have any kind of feedback loop like the one the Voyager researchers used, but I have hand-tested running some errors (like the one above) through GPT-4, and so far I've had poor results getting it to raise a flag on subtle issues.
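Some of these subtle issues may be easier to catch with plain code than with a second LLM pass. As a minimal sketch, the apiGroups mistake above could be flagged by comparing the generated policy against a table of known resources. `KNOWN_GROUPS` and `lintApiGroups` are hypothetical; in a real cluster the table would presumably be built from API discovery rather than hard-coded.

```javascript
// Hypothetical lookup table of resource plural name -> API group.
// Hard-coded here for illustration only.
const KNOWN_GROUPS = new Map([
  ['iampolicies', 'iam.cnrm.cloud.google.com'],
])

// Returns a list of human-readable problems; an empty list means no
// mismatch was found for the resources we know about.
function lintApiGroups(policy, knownGroups = KNOWN_GROUPS) {
  const groups = policy?.spec?.apiGroups ?? ['*']
  const problems = []
  for (const resource of policy?.spec?.resources ?? []) {
    if (!knownGroups.has(resource)) continue // unknown resource, nothing to compare
    const expected = knownGroups.get(resource)
    if (!groups.includes('*') && !groups.includes(expected)) {
      problems.push(
        `${resource}: expected apiGroup '${expected}', got ${JSON.stringify(groups)}`
      )
    }
  }
  return problems
}
```

Run against the generated policy above, this reports one problem for iampolicies; with the corrected group it reports none.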

I will most likely need to fall back to adding more information to the context. The context window has thankfully been growing with each release of GPT, though filling it adds to costs. There is also the possibility of fine-tuning a lower-end model on a lot of quality jspolicy content to see if we can make the code generation more accurate.

This could become a self-feeding pool of good training or context content if we store the effective policies that pass some level of human review.

With all that said, here is the controller so far, implemented (can you imagine) as a jspolicy controller:

const AIURL = 'https://api.openai.com/v1/chat/completions'
const AIKEY = env('OPENAI_API_KEY')

// TODO: move state tracking to the status subresource
const LASTAPPLIED_ANNOT = 'gpt.drzzl.io/last-applied-description'

print(`got event ${JSON.stringify(request)}`)

const log = msg => print(`${request.name}: ${msg}`)

if(request.operation === 'DELETE') {
  // The first delete with a finalizer has an object
  if(request.object) {
    //TODO: Delete the owned jspolicy instance
    log(`removing owned jspolicy instance`)
  } else {
    log(`not handling delete event`)
  }

  // Done with delete handling, jump out
  allow()
}

log(`handling creation event`)
// Check if the last applied annotation matches the current description,
// if not then we need to create or update the owned jspolicy
if(request.object.metadata.annotations?.[LASTAPPLIED_ANNOT] !== request.object.spec.description) {
  log(`generating jspolicy code`)

  const description = request.object.spec.description

  const payload = {
    model: 'gpt-4',
    temperature: 0.02,
  //  top_p: 0.4,
    messages: [
      { role: 'system', content: `You are an expert in kubernetes, javascript, and JsPolicy that is responsible for creating and updating jspolicy resources in the kubernetes cluster.

You will be provided an 'owner:' and 'description:' for the policy resource.
The 'description:' provided is the description of what the policy code should accomplish and you should always use the owner's name for policy name and set its ownerReferences to the owner provided.

The policy code should be a string contained in the 'javascript' property of the resource and the current JsPolicy version is 'policy.jspolicy.com/v1beta1'.

In order to limit which resources a particular policy will be triggered for, use the following JsPolicy resource properties.

    operations:
        An array of strings to constrain the Kubernetes CRUD operations to trigger on (any combination of 'CREATE', 'UPDATE', 'DELETE').

    resources:
        An array of strings to constrain the Kubernetes resource plural names to trigger on (e.g. 'pods', 'deployments', 'services' etc.).

    scope:
        A string to constrain the Kubernetes resource scope to trigger on ('Namespaced', 'Cluster', or '*' for both;  defaults to '*').

    apiGroups:
        An array of strings to constrain the Kubernetes API groups to trigger on (default: '*' matches all API groups).

    apiVersions:
        An array of strings to constrain the Kubernetes API versions to trigger on (default: '*' matches all API versions).

The following is a description of jspolicy functions available to call in policy code:

    mutate():
        Only available when the policy's 'spec.type' is set to 'Mutating', and tells jsPolicy to calculate a patch between the original request.object and the newly passed object. As soon as mutate(changedObj) is called, execution will be stopped. JsPolicy will remember the original request.object, which means you can freely change this object within the policy and call mutate(request.object) afterwards. If the passed object and the original object do not have any differences, jsPolicy will do nothing.

    allow():
        Allows a request and terminates execution immediately. This means that statements after allow() will not be executed anymore.

    deny():
        Denies a request immediately and halts execution. You can specify a message, reason and code via the parameters, which will be printed to the client. In controller policies, deny() will only log the request to the violations log of a policy.`
      },
      { role: 'user', content: `owner:
      apiVersion: gpt.drzzl.io/v1
      kind: GPTPolicy
      name: ${request.name}
      uid: ${request.object.metadata.uid}
      controller: true
  description:
      ${description}
  `
      },
    ],
    function_call: { name: 'create' },
    functions: [
      {
        name: 'create',
        description: 'Creates a new resource instance of any kind in the kubernetes cluster.',
        parameters: {
          type: 'object',
          description: 'The kubernetes API resource to create in the cluster.',
          properties: {
            apiVersion: { type: 'string' },
            kind: { type: 'string' },
            metadata: {
              type: 'object',
              properties: {
                namespace: { type: 'string' },
                name: { type: 'string' },
              },
              required: ['name'],
            },
            spec: { type: 'object' },
          },
          required: ['apiVersion', 'kind', 'metadata', 'spec'],
        }
      },
    ]
  }

  try {
    log('calling GPT')
    const resp = fetchSync(AIURL, {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${AIKEY}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(payload)
    })
    const fval = resp.json()
    log(`got create call ${JSON.stringify(fval)}`)

    if(resp.ok) {
      const policy = JSON.parse(fval.choices[0].message.function_call.arguments)
      const cresp = create(policy)
      if(!cresp.ok) {
        throw new Error(`error calling create: ${cresp.message}`)
      }
    } else {
      throw new Error(`Error response: ${resp.status}`)
    }
  } catch(err) {
    //TODO: Should we execute a chain-of-thought flow here to resolve the
    // error? Perhaps preempt the issues like the voyager team.
    log(`error making policy: ${err}`)
    requeue('requeuing for error calling gpt')
  }
    
  // Set last applied annotation on the GPTPolicy instance
  log('updating last applied annotation')
  const meta = request.object.metadata
  meta.annotations = {...meta.annotations, [LASTAPPLIED_ANNOT]: description}
  const upobj = update(request.object)

  if(!upobj.ok) {
    // `err` only exists inside the catch block above; report the update response instead
    log(`error updating last applied: ${upobj.message}`)
    allow()
  }

  // Done with creation handling, bye
  log('done with create reconciliation')
  allow()
}

log(`no action taken`)

I run this as a JsPolicy with type: Controller, with the above code in its spec.javascript property:

apiVersion: policy.jspolicy.com/v1beta1
kind: JsPolicy
metadata:
  name: gptpolicy-controller.drzzl.io
spec:
  type: Controller
  resources:
  - gptpolicies
  scope: Cluster
  operations:
  - CREATE
  - DELETE

I'd like to add some feedback to the generation loop, maybe a chain-of-thought flow for problem solving, to catch the more obvious issues. There's also the option of feeding back errors from the platform itself, such as JS runtime and k8s API errors. We do, however, get k8s-powered backoff logic for free.
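One shape such a loop could take is sketched below: generate a policy, validate it, and if there are problems, append them to the conversation and try again. `generateWithFeedback` is hypothetical, and so are the `generate` and `validate` callbacks, which stand in for the GPT call and for whatever checks or platform errors are available. It is synchronous because the controller already uses fetchSync, and replaying the previous policy as plain assistant content is a simplification of how a function-call turn would really be recorded.

```javascript
// Sketch of a generate/validate/retry loop. All names are hypothetical.
//   generate(history) -> policy object (e.g. wraps the GPT call)
//   validate(policy)  -> array of problem strings, empty when acceptable
function generateWithFeedback(messages, generate, validate, maxAttempts = 3) {
  let history = [...messages]
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const policy = generate(history)
    const problems = validate(policy)
    if (problems.length === 0) return { policy, attempts: attempt }
    // Feed the rejected policy and the reasons back to the model.
    history = [
      ...history,
      { role: 'assistant', content: JSON.stringify(policy) },
      {
        role: 'user',
        content: `The policy has these problems, please fix them:\n- ${problems.join('\n- ')}`,
      },
    ]
  }
  throw new Error(`no valid policy after ${maxAttempts} attempts`)
}
```

Capping the attempts keeps a stubborn prompt from turning into an unbounded bill, and a thrown error could still fall through to the existing requeue path.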

I would also like to expand it to target small, reproducible steps of building up the logic for a policy, similar to how Voyager builds up a skill library. That way past learning is codified in a code library that the LLM can use to solve similar problems, at a different level than training: a level that the LLM controls.

This was shown to be a huge benefit to the Voyager system when paired with a higher-level goal-seeking agent.

This was the policy that took my cluster down:

apiVersion: gpt.drzzl.io/v1
kind: GPTPolicy
metadata:
  name: no-prod-pods-in-test-ns.drzzl.io
spec:
  description: "Deny pods with the label 'env=prod' in the 'test' namespace."

Since I run my cluster on spot instances, when the instance running the webhook backing this policy went down, no pods could be started any longer (importantly, that included the replacement for the webhook pod itself), because the policy was triggered on scheduling updates and, with the webhook unreachable, those updates were denied by default.

If I hadn't caught it when I did, then once my spot instances had all cycled through their lifetimes, no pods would have been running at all. In a non-spot cluster, when the instance that went down came back up, the webhook would have been started back up by its kubelet.
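A cheap guard against this class of outage would be to inspect each generated policy before creating it and refuse anything that gates pods cluster-wide while failing closed. The sketch below assumes the JsPolicy spec exposes `failurePolicy` and `namespaceSelector`, mirroring the Kubernetes admission webhook configuration, and that anything other than an explicit 'Ignore' fails closed. Both are assumptions to check against the jsPolicy reference, and `blastRadiusWarnings` is a hypothetical helper.

```javascript
// Sketch: flag generated policies that could block pod admission cluster-wide.
// Field names (failurePolicy, namespaceSelector) are assumed, not verified.
function blastRadiusWarnings(policy) {
  const spec = policy?.spec ?? {}
  const warnings = []
  // Controller policies react to events and do not gate admission.
  if (spec.type === 'Controller') return warnings
  const resources = spec.resources ?? []
  const touchesPods = resources.includes('pods') || resources.includes('*')
  // Assumption: without an explicit 'Ignore', an unreachable webhook denies requests.
  const failsClosed = spec.failurePolicy !== 'Ignore'
  if (touchesPods && failsClosed && !spec.namespaceSelector) {
    warnings.push(
      'policy gates pods in every namespace and fails closed; ' +
      'an unreachable webhook would block its own replacement pod'
    )
  }
  return warnings
}
```

A policy restricted by a namespace selector, or one that tolerates webhook failure, returns no warnings. Which of those two trade-offs is acceptable depends on the policy.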