# Rate limits

## Current Rate Limits (Beta Tier)

| Endpoint           | Limit        | Time Window |
| ------------------ | ------------ | ----------- |
| /ask               | 60 requests  | per minute  |
| /legal-summary     | 30 requests  | per minute  |
| /generate-document | 15 requests  | per minute  |
| /status            | 120 requests | per minute  |

{% hint style="danger" %}
**Note**: Exceeding these limits will result in a `429 Too Many Requests` response. Please implement retry logic in your application.
{% endhint %}
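One way to stay under these limits is to throttle on the client before sending. The sketch below is illustrative only: `SlidingWindowLimiter` and `BETA_LIMITS` are hypothetical names, not part of the API, and it assumes a rolling 60-second window, which may differ from how the server actually counts requests.

```python
import time
from collections import deque

# Per-minute limits copied from the Beta Tier table above.
BETA_LIMITS = {
    "/ask": 60,
    "/legal-summary": 30,
    "/generate-document": 15,
    "/status": 120,
}


class SlidingWindowLimiter:
    """Client-side guard (hypothetical helper, not part of the API)."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self.sent = deque()  # timestamps of requests still inside the window

    def seconds_until_allowed(self):
        """Return 0.0 if a request may be sent now, else the wait in seconds."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()
        if len(self.sent) < self.limit:
            return 0.0
        return self.window - (now - self.sent[0])

    def record(self):
        """Call once for every request actually sent."""
        self.sent.append(self.clock())
```

A caller would keep one limiter per endpoint, for example `SlidingWindowLimiter(BETA_LIMITS["/ask"])`, sleep for `seconds_until_allowed()` when it is non-zero, and call `record()` after each request. Retry logic is still needed, because the server's count remains authoritative.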

## Response Headers

To help you manage your usage, each API response includes rate-limit headers:

```http
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 12
X-RateLimit-Reset: 1685583492
```

* **X-RateLimit-Limit**: Maximum number of requests allowed in the window
* **X-RateLimit-Remaining**: Requests left in the current window
* **X-RateLimit-Reset**: Unix timestamp (seconds since the epoch) at which the limit resets
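As a rough illustration, these headers can be read into a small structure that reports how long remains until the window resets. `RateLimitState` and `parse_rate_limit_headers` are hypothetical names introduced for this sketch, and the values are the sample headers shown above.

```python
import time
from dataclasses import dataclass


@dataclass
class RateLimitState:
    limit: int
    remaining: int
    reset_at: int  # Unix timestamp taken from X-RateLimit-Reset

    def seconds_until_reset(self, now=None):
        """Seconds left in the current window (never negative)."""
        now = time.time() if now is None else now
        return max(0.0, self.reset_at - now)


def parse_rate_limit_headers(headers):
    """Build a RateLimitState from a mapping of response headers."""
    return RateLimitState(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset_at=int(headers["X-RateLimit-Reset"]),
    )


# Sample values from the header listing above.
state = parse_rate_limit_headers({
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "12",
    "X-RateLimit-Reset": "1685583492",
})
```

A plain `dict` lookup is case-sensitive, whereas HTTP header names are not. Client libraries such as `requests` expose headers through a case-insensitive mapping, which avoids that problem.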

## Retry Strategy (Recommended)

When your app receives a `429` response, retry with **exponential backoff** and jitter: wait longer after each consecutive failure, and randomize each wait so that multiple clients do not retry in lockstep. Do not retry immediately after a failure.

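**Example:**

```python
import random
import time

# `response` is assumed to be an HTTP response object, e.g. from the `requests` library.
if response.status_code == 429:
    # X-RateLimit-Reset is the Unix time at which the current window resets.
    wait_time = max(0.0, int(response.headers["X-RateLimit-Reset"]) - time.time())
    time.sleep(wait_time + random.uniform(0, 1))  # add up to 1 s of random jitter
```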
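The snippet above waits once. A fuller sketch of exponential backoff with jitter might look like the following. It uses the "full jitter" variant, where each delay is drawn uniformly between zero and an exponentially growing ceiling. The function names, the `send` callable, and the defaults (`base=1.0`, `cap=60.0`, `max_attempts=5`) are assumptions chosen for illustration, not values prescribed by the API.

```python
import random
import time


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Full-jitter delay: uniform in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_retries(send, max_attempts=5, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    `send` is a hypothetical zero-argument callable returning an object
    with a `status_code` attribute (for example, a requests.Response).
    """
    for attempt in range(max_attempts):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_attempts - 1:
            # Ceiling doubles each attempt: 1 s, 2 s, 4 s, ... up to `cap`.
            sleep(backoff_delay(attempt))
    return response  # still 429 after the final attempt
```

With the `requests` library, usage could be as simple as `call_with_retries(lambda: requests.get(url))`, where `url` points at one of the endpoints listed above. Capping the delay keeps a long run of failures from producing unbounded waits.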

## Upgraded Plans

Higher rate limits will be available on the following plans:

* ✅ **Pro Tier (Coming Soon)**: Higher burst capacity and concurrent sessions
* ✅ **Enterprise Tier**: Custom SLAs and priority support

Join the [waitlist](https://lawyergptai.cc) to request priority access when tiered plans go live.
