
Security Best Practices

Follow these best practices to ensure secure M2M authentication for your autonomous agents.


Credential Management

Never Hard-Code Secrets

Bad Practice:

CLIENT_SECRET = "secret_abc123_THIS_IS_WRONG"

Good Practice:

import os
CLIENT_SECRET = os.getenv('AUTHSEC_CLIENT_SECRET')
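
Note that `os.getenv` returns `None` when the variable is unset, which tends to surface later as a confusing authentication failure. A minimal sketch of a fail-fast loader follows; the helper name `require_env` is hypothetical and not part of any AuthSec SDK.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail loudly at startup."""
    value = os.environ.get(name, "").strip()
    if not value:
        # Only the variable name appears in the error, never a secret value.
        raise RuntimeError(f"Required environment variable {name} is not set")
    return value
```

With this helper, the assignment above could be written as `CLIENT_SECRET = require_env('AUTHSEC_CLIENT_SECRET')`, so a missing secret stops the process before any token request is attempted.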

Use Secret Management Systems

Recommended Solutions:

  • HashiCorp Vault - Enterprise secret management
  • AWS Secrets Manager - Cloud-native for AWS
  • Azure Key Vault - Cloud-native for Azure
  • Google Secret Manager - Cloud-native for GCP
  • Kubernetes Secrets - Container orchestration

Example with Vault:

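import os
import hvac

# Authenticate to Vault with a token taken from the environment, not from source code
client = hvac.Client(url='https://vault.example.com', token=os.environ['VAULT_TOKEN'])

# KV v2 nests the stored key/value pairs under data -> data
secret = client.secrets.kv.v2.read_secret_version(path='authsec/client')

CLIENT_ID = secret['data']['data']['client_id']
CLIENT_SECRET = secret['data']['data']['client_secret']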

Token Lifecycle

Use Short-Lived Tokens

Recommended Lifetimes:

  • Access Tokens: 1-12 hours
  • JWT SVIDs: 1-24 hours
  • X.509 Certificates: 1-7 days

Shorter lifetimes reduce risk if credentials are compromised.

Implement Token Caching

Don't request a new token for every API call:

import time
from threading import Lock

class TokenCache:
    def __init__(self):
        self.token = None
        self.expires_at = 0
        self.lock = Lock()

    def get_token(self):
        with self.lock:
            now = time.time()

            # Refresh if token expires in < 5 minutes
            if now >= (self.expires_at - 300):
                self._refresh_token()

            return self.token

    def _refresh_token(self):
        response = request_new_token()
        self.token = response['access_token']
        self.expires_at = time.time() + response['expires_in']
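
To see the effect of caching without calling a real token endpoint, here is a self-contained sketch of the same idea with an injectable fetch function and clock. The names `CachedToken` and `fake_fetch` are illustrative assumptions; in practice the fetch function would wrap the actual token request.

```python
import threading
import time
from typing import Callable, Dict, Optional


class CachedToken:
    """Thread-safe token cache with an injectable fetch function and clock."""

    def __init__(self, fetch: Callable[[], Dict], skew_seconds: int = 300,
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch
        self._skew = skew_seconds
        self._clock = clock
        self._lock = threading.Lock()
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        with self._lock:
            # Refresh when the token is missing or within `skew_seconds` of expiry.
            if self._clock() >= self._expires_at - self._skew:
                response = self._fetch()
                self._token = response["access_token"]
                self._expires_at = self._clock() + response["expires_in"]
            return self._token


calls = []


def fake_fetch() -> Dict:
    # Stand-in for a real token request; records how often it is called.
    calls.append(1)
    return {"access_token": f"token-{len(calls)}", "expires_in": 3600}


cache = CachedToken(fake_fetch)
first = cache.get()
second = cache.get()  # served from the cache, no second fetch
```

Injecting the clock keeps the refresh logic easy to unit test: advancing a fake clock past the refresh threshold should trigger exactly one new fetch.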

Revoke Unused Tokens

Immediately revoke credentials when:

  • Workload is decommissioned
  • Security incident detected
  • Service no longer needs access
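
If the authorization server exposes an OAuth 2.0 token revocation endpoint in the style of RFC 7009, revocation is a single authenticated POST. The sketch below only builds the request and does not send it. Whether AuthSec offers such an endpoint, and at what URL, is an assumption here; check the product documentation for the actual path.

```python
import base64
import urllib.parse
import urllib.request


def build_revocation_request(endpoint: str, token: str,
                             client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) an RFC 7009-style token revocation request."""
    if not endpoint.startswith("https://"):
        raise ValueError("revocation endpoint must use HTTPS")

    # The token to revoke is sent as a form-encoded body.
    body = urllib.parse.urlencode({
        "token": token,
        "token_type_hint": "access_token",
    }).encode("ascii")

    # The client authenticates with HTTP Basic, using form-encoded credentials.
    user = urllib.parse.quote(client_id, safe="")
    password = urllib.parse.quote(client_secret, safe="")
    basic = base64.b64encode(f"{user}:{password}".encode("ascii")).decode("ascii")

    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Sending the request is then a call to `urllib.request.urlopen(request)`. Under RFC 7009 the server answers with HTTP 200 even when the token was already invalid, so a successful response does not prove the token was ever active.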

Access Control

Principle of Least Privilege

Grant only necessary permissions:

Bad:

{
  "scopes": ["*", "admin:all"]
}

Good:

{
  "scopes": ["read:data", "write:logs"]
}
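
A check like this can run in CI or at client registration time to catch overly broad grants before they ship. This is a rough sketch: the rule that a `*` anywhere in a scope or an `admin:` prefix counts as over-broad is an assumption drawn from the example above, not an AuthSec convention.

```python
from typing import Iterable, List


def overbroad_scopes(scopes: Iterable[str]) -> List[str]:
    """Return the scopes that look like wildcards or blanket admin grants."""
    return [s for s in scopes if "*" in s or s.startswith("admin:")]
```

An empty result means the scope list passed this particular check. It does not prove the remaining scopes are minimal for the workload.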

Use Role-Based Access Control (RBAC)

Define roles and assign them to workloads:

roles:
  ml-trainer:
    scopes:
      - read:training-data
      - write:model-registry

  api-gateway:
    scopes:
      - read:user-profiles
      - write:audit-logs
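
On the enforcement side, the role definitions above can be mirrored as a lookup table. The following is a minimal sketch that hard-codes the two example roles. A real deployment would presumably load them from the policy source, and `role_allows` is a hypothetical helper name.

```python
# Role-to-scope mapping copied from the example role definitions above.
ROLES = {
    "ml-trainer": {"read:training-data", "write:model-registry"},
    "api-gateway": {"read:user-profiles", "write:audit-logs"},
}


def role_allows(role: str, scope: str) -> bool:
    """Return True only if the role exists and explicitly includes the scope."""
    return scope in ROLES.get(role, set())
```

Unknown roles fall through to an empty set, so the check denies by default.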

Implement Resource-Level Authorization

Don't just authenticate—authorize specific actions:

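def can_access_resource(token, resource_id):
    claims = verify_token(token)

    # Check if token has required scope.
    # OAuth 2.0 scopes are usually a space-delimited string; split it so that
    # 'read:data' cannot match as a substring of a different scope.
    scopes = claims['scope']
    if isinstance(scopes, str):
        scopes = scopes.split()
    if 'read:data' not in scopes:
        return False

    # Check if resource belongs to same tenant
    if claims['tenant_id'] != get_resource_tenant(resource_id):
        return False

    return True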

Network Security

Use Mutual TLS (mTLS)

Verify both client and server identities:

SPIRE mTLS:

tlsConfig := tlsconfig.MTLSServerConfig(
    source,
    source,
    tlsconfig.AuthorizeID(spiffeid.RequireFromString(
        "spiffe://authsec.example.com/production/allowed-client",
    )),
)
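
For services written in Python without a SPIFFE library, the same two ideas (require a client certificate, then authorize a specific SPIFFE ID) can be approximated with the standard `ssl` module. This is a sketch under assumptions: the function names are hypothetical, and the certificate, key, and trust bundle still have to be loaded from files that your SPIRE setup writes.

```python
import ssl
from typing import Optional


def new_mtls_server_context() -> ssl.SSLContext:
    """Create a server-side TLS context that rejects clients without a certificate."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED  # the client must present a valid certificate
    return ctx


def peer_spiffe_id(peer_cert: dict) -> Optional[str]:
    """Extract the SPIFFE ID from the dict returned by SSLSocket.getpeercert()."""
    uris = [value for kind, value in peer_cert.get("subjectAltName", ()) if kind == "URI"]
    # An X.509-SVID carries exactly one URI SAN; treat anything else as invalid.
    if len(uris) == 1 and uris[0].startswith("spiffe://"):
        return uris[0]
    return None
```

Before serving traffic, the context also needs `ctx.load_cert_chain(cert_file, key_file)` for the server identity and `ctx.load_verify_locations(ca_file)` for the trust bundle. After the handshake, compare `peer_spiffe_id(conn.getpeercert())` with the allowed ID and close the connection on any mismatch.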

Encrypt All Communication

Never transmit credentials over HTTP:

  • ✗ http://auth.example.com/token (plaintext; credentials can be intercepted in transit)
  • ✓ https://auth.authsec.ai/token (encrypted with TLS)
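
One way to enforce this in client code is to validate the endpoint URL before any credential is sent. This is a minimal sketch, and `require_https` is a hypothetical helper name.

```python
from urllib.parse import urlsplit


def require_https(url: str) -> str:
    """Return the URL unchanged if it uses HTTPS, otherwise raise ValueError."""
    parts = urlsplit(url)
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"refusing non-HTTPS endpoint: {url}")
    return url
```

Calling this once where the token endpoint is configured means a misconfigured `http://` URL fails at startup instead of leaking a client secret on the first request.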

Restrict Network Access

Use network policies to limit workload communication:

Kubernetes Network Policy:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-spire-agent
spec:
  podSelector:
    matchLabels:
      app: ml-agent
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              name: spire
      ports:
        - protocol: TCP
          port: 8081

Monitoring & Auditing

Log All Authentication Events

Track every token request and usage:

import logging
import time

logger = logging.getLogger('authsec')

def get_access_token():
    logger.info("Requesting access token", extra={
        'client_id': CLIENT_ID,
        'scopes': 'read:data write:logs',
        'timestamp': time.time()
    })

    token = request_token()

    # Log only the token's identifier and expiry, never the token itself
    logger.info("Access token obtained", extra={
        'token_id': extract_token_id(token),
        'expires_at': extract_expiry(token)
    })

    return token

Set Up Alerts

Monitor for suspicious activity:

Alert Triggers:

  • Failed authentication attempts (> 5 in 1 minute)
  • Token requests from unexpected IPs
  • Access to unauthorized resources
  • Certificate rotation failures
  • Expired credentials still being used

Example with Prometheus:

groups:
  - name: authsec_alerts
    rules:
      - alert: HighAuthFailureRate
        expr: rate(authsec_auth_failures[5m]) > 0.1
        annotations:
          summary: "High authentication failure rate detected"
Review Access Logs Regularly

Audit who accessed what resources:

-- Query AuthSec audit logs
SELECT
    timestamp,
    client_id,
    resource,
    action,
    result
FROM audit_logs
WHERE timestamp > NOW() - INTERVAL '7 days'
  AND result = 'denied'
ORDER BY timestamp DESC;

Certificate Management (SPIRE)

Automatic Rotation

Ensure SVIDs rotate before expiration:

SPIRE Agent Config:

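agent {
    # Rotate certificates when 50% of lifetime remains.
    # Confirm this setting name against the agent configuration reference
    # for your SPIRE version before relying on it.
    svid_store_cache_expiry = "50%"
}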
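
For workloads that manage their own certificates, or for monitoring that wants to flag an overdue rotation, the half-lifetime rule can be expressed directly. This sketch assumes the certificate's validity window is already available as timezone-aware datetimes. The function name and the `fraction_remaining` parameter are illustrative.

```python
from datetime import datetime, timedelta, timezone


def needs_rotation(not_before: datetime, not_after: datetime, now: datetime,
                   fraction_remaining: float = 0.5) -> bool:
    """Return True once the remaining lifetime is at or below the given fraction."""
    lifetime = not_after - not_before
    remaining = not_after - now
    return remaining <= lifetime * fraction_remaining


# Example: a one-hour certificate checked 20 and 30 minutes after issuance.
issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
expires = issued + timedelta(hours=1)
early = needs_rotation(issued, expires, issued + timedelta(minutes=20))
halfway = needs_rotation(issued, expires, issued + timedelta(minutes=30))
```

Here `early` is `False` because 40 of 60 minutes remain, and `halfway` is `True` because exactly half the lifetime is left.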

Validate Certificate Chains

Always verify the full trust chain:

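import (
    "crypto/x509"
    "time"

    "github.com/spiffe/go-spiffe/v2/bundle/x509bundle"
)

func verifyCertificate(cert *x509.Certificate, bundle *x509bundle.Bundle) error {
    // VerifyOptions.Roots expects a *x509.CertPool, so collect the
    // bundle's authorities into a pool first.
    roots := x509.NewCertPool()
    for _, authority := range bundle.X509Authorities() {
        roots.AddCert(authority)
    }

    _, err := cert.Verify(x509.VerifyOptions{
        Roots:       roots,
        CurrentTime: time.Now(),
        KeyUsages:   []x509.ExtKeyUsage{x509.ExtKeyUsageClientAuth},
    })
    return err
}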

Revoke Compromised Certificates

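If a workload is compromised, delete its registration entry immediately so that SPIRE stops issuing it new SVIDs. SVIDs that were already issued remain valid until they expire, which is one more reason to keep lifetimes short: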

kubectl exec -n spire spire-server-0 -- \
    /opt/spire/bin/spire-server entry delete \
    -entryID <ENTRY_ID>

Incident Response

Have a Revocation Plan

Steps if credentials are compromised:

  1. Immediate Actions:

    • Revoke compromised credentials in AuthSec
    • Delete SPIRE registration entries
    • Block workload network access
  2. Investigation:

    • Review audit logs for unauthorized access
    • Identify affected resources
    • Determine breach timeline
  3. Remediation:

    • Rotate all potentially affected credentials
    • Apply security patches
    • Update access policies
  4. Prevention:

    • Implement additional monitoring
    • Review and strengthen security practices
    • Conduct post-mortem analysis

Emergency Contact

Keep these contacts readily available:


Compliance & Standards

Follow Industry Standards

NIST Guidelines:

  • Use strong cryptography (RSA 2048+, ECDSA P-256+)
  • Implement multi-factor authentication where possible
  • Maintain audit trails for 1+ years

OWASP Recommendations:

  • Validate all inputs
  • Use secure random number generation
  • Implement rate limiting

Data Residency

Ensure compliance with data regulations:

  • GDPR (Europe)
  • CCPA (California)
  • HIPAA (Healthcare)
  • SOC 2 (Enterprise)

Regular Security Audits

Quarterly Reviews:

  • Active client credentials
  • Granted scopes and permissions
  • Certificate expiration dates
  • Network access policies

Annual Penetration Testing:

  • Engage external security firms
  • Test M2M authentication flows
  • Validate incident response procedures

Common Vulnerabilities

Avoid These Mistakes

Credential Leakage:

  • ✗ Logging full tokens
  • ✗ Committing secrets to Git
  • ✗ Storing in plaintext files
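
When a log line needs to correlate with a specific token, a short one-way fingerprint can stand in for the token itself. This is a minimal sketch; the 12-character prefix length and the `sha256:` label are arbitrary choices, not an AuthSec format.

```python
import hashlib


def token_fingerprint(token: str) -> str:
    """Return a short, non-reversible identifier that is safe to write to logs."""
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return f"sha256:{digest[:12]}"
```

The same token always produces the same fingerprint, so log entries can still be joined across services without exposing a usable credential.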

Insufficient Validation:

  • ✗ Not verifying token signatures
  • ✗ Accepting expired tokens
  • ✗ Skipping certificate validation
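
To make the first two items concrete, the sketch below verifies a JWT signature and rejects expired tokens using only the standard library. It assumes an HS256 (shared-secret) token purely to stay dependency-free. M2M deployments more commonly use asymmetric algorithms such as RS256 or ES256, verified with a maintained JWT library and the issuer's published keys. The function names are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def sign_hs256(claims: dict, key: bytes) -> str:
    """Create a compact HS256 JWT (used here only to produce sample input)."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(key, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_hs256(token: str, key: bytes, now: Optional[float] = None) -> dict:
    """Return the claims if the signature is valid and the token has not expired."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        signature = _b64url_decode(sig_b64)
    except (ValueError, TypeError) as exc:
        raise ValueError("malformed token") from exc

    # Pin the expected algorithm instead of trusting whatever the header claims.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")

    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    expected = hmac.new(key, signing_input, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")

    # Only parse and trust the claims after the signature has been verified.
    claims = json.loads(_b64url_decode(payload_b64))
    exp = claims.get("exp") if isinstance(claims, dict) else None
    current = time.time() if now is None else now
    if not isinstance(exp, (int, float)) or current >= exp:
        raise ValueError("token expired or missing exp")
    return claims
```

Two design choices matter here: the algorithm is pinned, not read from the token and obeyed, and a token without an `exp` claim is rejected instead of being treated as valid forever.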

Overprivileged Access:

  • ✗ Granting admin scopes by default
  • ✗ Reusing credentials across environments
  • ✗ No scope limitations

Security Checklist

Use this checklist before deploying to production:

  • Credentials stored in secret manager (not code)
  • Short-lived tokens (< 24 hours)
  • Token caching implemented
  • Minimal scopes granted (least privilege)
  • mTLS enabled for service communication
  • All traffic encrypted (HTTPS/TLS)
  • Network policies restrict access
  • Authentication/authorization logs enabled
  • Alerts configured for failures
  • Certificate rotation automated
  • Revocation procedure documented
  • Security audit completed
  • Compliance requirements met


Following these best practices helps your autonomous agents communicate securely under a zero-trust authentication model. 🔒