
WebSocket API Documentation Best Practices: AsyncAPI Guide

Comprehensive guide to documenting WebSocket endpoints using AsyncAPI standards. Learn best practices for versioning, streaming data, error handling, and structuring documentation.


What are the best practices for documenting WebSocket endpoints using OpenAPI or AsyncAPI standards? Specifically, how should I:

  1. Describe the connection process, server configurations, and port specifications?
  2. Implement endpoint versioning for WebSocket APIs?
  3. Document streaming input (audio data) and output (text transcription) in chunks?
  4. Define error codes and success responses?
  5. Structure documentation to separate connection establishment from in-connection operations?

I’m particularly concerned about:

  • How to properly version WebSocket endpoints
  • Best practices for describing streaming data flows
  • Whether different documentation approaches are needed for connection establishment versus in-connection operations
  • Appropriate endpoint path naming conventions for WebSocket APIs

The use case involves creating a new WebSocket endpoint in an existing HTTP service that will:

  • Accept connections from a frontend
  • Receive audio streams
  • Forward audio to a gRPC transcription service
  • Return text transcription in chunks
  • Handle normal and error-based connection closures

AsyncAPI is the definitive standard for documenting WebSocket endpoints, providing the necessary structure for duplex, message-based communication that OpenAPI cannot adequately support. When documenting your audio transcription WebSocket, you’ll need a three-tier documentation structure with proper versioning, a binary format for audio streams, and separate handling of connection establishment versus in-connection operations.


AsyncAPI vs OpenAPI for WebSocket Documentation

When considering documentation standards for WebSocket endpoints, it’s crucial to understand that OpenAPI, while widely used for REST APIs, has significant limitations for event-driven, real-time communication. As the AsyncAPI documentation team states, “OpenAPI specification won’t help you much here” when dealing with WebSocket APIs. This fundamental difference stems from their architectural foundations—OpenAPI is designed around the request-response paradigm, while AsyncAPI embraces event-driven communication patterns.

OpenAPI 3.0 does support WebSocket schemes (ws:// and wss://) in the servers array, but it lacks the ability to properly model the continuous, message-based nature of WebSocket communication. AWS’s Project Development Kit (PDK) documentation acknowledges this limitation, noting that “TypeSpec and Smithy are the recommended model languages for WebSocket APIs” rather than OpenAPI.

AsyncAPI, by contrast, was specifically created for asynchronous APIs like WebSockets, MQTT, and Kafka. It provides the necessary structure to document connection establishment, message flows, and streaming data patterns. For your audio transcription WebSocket, AsyncAPI offers the tools to document the duplex communication where you receive binary audio streams and return structured text transcription chunks.

The key difference lies in their approach to documentation:

  • OpenAPI: Models HTTP requests and responses
  • AsyncAPI: Models channels, operations, and messages over persistent connections

This distinction makes AsyncAPI the clear choice for documenting your WebSocket endpoint that handles continuous audio streaming and transcription results.


Documenting Connection Establishment and Server Configuration

For WebSocket APIs, the connection process is fundamentally different from HTTP connections—it’s a persistent, stateful channel rather than stateless request-response cycles. In AsyncAPI, you document the connection establishment in the servers section, specifying the WebSocket endpoint details including protocol, host, port, and any authentication requirements.

Server Configuration Basics

Your AsyncAPI document begins with the servers section, where you define the WebSocket endpoints:

```yaml
servers:
  production:
    url: wss://api.example.com/v1/ws/transcribe
    protocol: wss
    description: Production WebSocket server for audio transcription
    variables:
      region:
        description: AWS region
        default: us-east-1
```

This configuration specifies:

  • The WebSocket secure protocol (wss://)
  • The full endpoint URL including version
  • Optional variables for environment-specific configurations

Authentication Configuration

For production WebSocket endpoints, authentication is typically handled during the handshake phase. AsyncAPI allows you to document various authentication methods:

```yaml
servers:
  production:
    # ... server configuration as above
    security:
      - ApiKey: []
      - OAuth2: [transcribe:read]

components:
  securitySchemes:
    ApiKey:
      type: httpApiKey
      in: header
      name: X-API-Key
    OAuth2:
      type: oauth2
      flows:
        implicit:
          authorizationUrl: https://api.example.com/oauth/authorize
          scopes:
            transcribe:read: Access to transcription service
```

Channel Bindings for Handshake

The connection establishment itself is documented using channel bindings. In AsyncAPI, channels represent the communication path, and bindings specify how that channel operates at the protocol level:

```yaml
channels:
  /transcribe:
    description: Audio transcription WebSocket endpoint
    servers: [production]
    bindings:
      ws:
        method: GET
        headers:
          Cache-Control: no-cache
          Upgrade: websocket
          Connection: Upgrade
```

This documentation ensures that developers understand not just where to connect, but how the WebSocket handshake should occur, including required headers and the upgrade process from HTTP to WebSocket.
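To make the handshake concrete: during the upgrade, the server derives the `Sec-WebSocket-Accept` response header from the client's `Sec-WebSocket-Key` by hashing it with a fixed GUID defined in RFC 6455. A minimal Python sketch of that computation:

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the WebSocket handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def accept_key(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header value the server must
    return for a given client Sec-WebSocket-Key header."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

# The example key/accept pair from RFC 6455, section 1.3:
print(accept_key("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

If the client receives any other value in the `Sec-WebSocket-Accept` header, it must fail the connection rather than proceed.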


WebSocket Endpoint Versioning Best Practices

Versioning WebSocket APIs presents unique challenges compared to REST APIs due to the persistent nature of the connections. Based on industry best practices and the AsyncAPI specification, URI-based versioning is the recommended approach for WebSocket endpoints.

URI Versioning Approach

The most widely adopted pattern is to include the version directly in the WebSocket path:

wss://api.example.com/v1/ws/transcribe
wss://api.example.com/v2/ws/transcribe

This approach provides several advantages:

  • Clear version identification in connection URLs
  • Easy routing at the load balancer or proxy level
  • Familiar pattern for developers experienced with REST API versioning
  • Support for parallel deployment of multiple versions

In your AsyncAPI document, you would specify the versioned server URL:

```yaml
servers:
  v1:
    url: wss://api.example.com/v1/ws/transcribe
    protocol: wss
    description: WebSocket API version 1 for audio transcription
  v2:
    url: wss://api.example.com/v2/ws/transcribe
    protocol: wss
    description: WebSocket API version 2 for audio transcription with improved features
```

Version Information in AsyncAPI

Additionally, you should include version information in the AsyncAPI info section:

```yaml
info:
  title: Audio Transcription WebSocket API
  version: 1.0.0
  description: |
    WebSocket API for real-time audio transcription.

    ## Version History
    - 1.0.0: Initial release with basic transcription features
    - 1.1.0: Added support for custom vocabulary (planned)
```

This dual approach—URI versioning combined with semantic versioning in the documentation—provides both runtime versioning and clear documentation of changes.

Header-Based Versioning Alternative

While URI versioning is preferred, header-based versioning is sometimes used for WebSocket APIs:

wss://api.example.com/ws/transcribe
Headers:
 X-API-Version: 1

This approach allows the same WebSocket endpoint to handle multiple versions based on headers, but it’s less common and can complicate routing at the infrastructure level. If you choose this approach, document it clearly in your AsyncAPI specification:

```yaml
servers:
  production:
    url: wss://api.example.com/ws/transcribe
    protocol: wss
    variables:
      version:
        enum:
          - '1'
          - '2'
        default: '1'
        description: API version to use
```

Regardless of your chosen approach, consistency is key. Stick to one versioning strategy across all your WebSocket endpoints and document it clearly in your API documentation.
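As a sketch of what URI versioning buys you at the routing layer, a hypothetical helper can pull the version segment out of an incoming connection path, falling back to a header-based scheme (such as `X-API-Version`) only when no segment is present:

```python
import re
from urllib.parse import urlparse

def extract_version(ws_url: str):
    """Extract a URI version segment (e.g. 'v1') from a WebSocket URL.
    Returns None if the path carries no version, in which case a
    header-based scheme could be consulted instead."""
    path = urlparse(ws_url).path
    match = re.match(r"^/(v\d+)/", path)
    return match.group(1) if match else None

print(extract_version("wss://api.example.com/v1/ws/transcribe"))  # v1
print(extract_version("wss://api.example.com/ws/transcribe"))     # None
```

A load balancer or reverse proxy can apply the same prefix match to route `/v1/` and `/v2/` connections to different backend deployments.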


Documenting Streaming Audio Input and Text Output

One of the most challenging aspects of documenting WebSocket APIs is effectively representing streaming data flows. For your audio transcription use case, this involves documenting both incoming binary audio data and outgoing structured text transcription chunks.

Binary Audio Format Documentation

Audio streaming requires special handling in AsyncAPI because it involves binary data rather than JSON. You should document the audio input using the binary format:

```yaml
channels:
  /transcribe:
    publish:
      summary: Send audio data for transcription
      description: |
        Send binary audio data in chunks. Audio should be encoded in PCM format
        with 16kHz sample rate, 16-bit depth, mono channel.
      message:
        $ref: '#/components/messages/AudioData'
    subscribe:
      summary: Receive transcription results
      description: |
        Receive text transcription results in chunks. Each message contains
        partial or final transcriptions with timestamps.
      message:
        $ref: '#/components/messages/TranscriptionResult'
```
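Given the PCM parameters above (16kHz, 16-bit, mono), a 100 ms chunk works out to 16000 × 2 × 0.1 = 3200 bytes. A hedged sketch of how a client might slice a raw PCM buffer into such chunks before publishing them over the socket:

```python
def pcm_chunks(pcm: bytes, sample_rate: int = 16000,
               sample_width: int = 2, chunk_ms: int = 100):
    """Yield fixed-duration chunks of a raw mono PCM byte buffer.
    The final chunk may be shorter than chunk_ms."""
    chunk_bytes = sample_rate * sample_width * chunk_ms // 1000  # 3200 for the defaults
    for offset in range(0, len(pcm), chunk_bytes):
        yield pcm[offset:offset + chunk_bytes]

# One second of silence at 16kHz / 16-bit mono -> ten 3200-byte chunks.
chunks = list(pcm_chunks(b"\x00" * 32000))
print(len(chunks), len(chunks[0]))  # 10 3200
```

Documenting the recommended chunk size alongside the format spares client authors from reverse-engineering it from latency behavior.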

Structured Text Output Documentation

For the text transcription output, you’ll want structured JSON that includes not just the text but also metadata like timestamps and completion status:

```yaml
components:
  messages:
    TranscriptionResult:
      name: transcription_result
      title: Transcription Result
      summary: Transcription text with metadata
      contentType: application/json
      payload:
        type: object
        properties:
          text:
            type: string
            description: Transcribed text
            example: "Hello, this is a test of the transcription service."
          startTime:
            type: number
            format: float
            description: Start time in seconds from audio beginning
            example: 1.23
          endTime:
            type: number
            format: float
            description: End time in seconds from audio beginning
            example: 2.45
          isFinal:
            type: boolean
            description: Whether this transcription is final or partial
            example: false
          alternatives:
            type: array
            items:
              type: object
              properties:
                text:
                  type: string
                  description: Alternative transcription
                  example: "Hello, this is a test of the speech recognition."
                confidence:
                  type: number
                  format: float
                  description: Confidence score between 0 and 1
                  example: 0.95
```

Handling Multiple Message Types

In real-world scenarios, your WebSocket might need to handle different types of messages. AsyncAPI supports this using the oneOf construct:

```yaml
components:
  messages:
    WebSocketMessage:
      oneOf:
        - $ref: '#/components/messages/AudioData'
        - $ref: '#/components/messages/TranscriptionResult'
        - $ref: '#/components/messages/ErrorMessage'
        - $ref: '#/components/messages/ControlMessage'
```
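On the receiving side, a `oneOf` channel usually implies a small client-side dispatcher that inspects each frame and routes it to the right handler. The discriminating fields below are illustrative assumptions, not part of any official schema:

```python
import json

def classify_frame(frame) -> str:
    """Route an incoming WebSocket frame to a message kind.
    Binary frames carry audio; text frames are JSON whose shape
    distinguishes transcriptions, errors, and control messages.
    (The discriminating fields here are assumptions for illustration.)"""
    if isinstance(frame, (bytes, bytearray)):
        return "AudioData"
    msg = json.loads(frame)
    if "code" in msg and "message" in msg:
        return "ErrorMessage"
    if "text" in msg or "isFinal" in msg:
        return "TranscriptionResult"
    return "ControlMessage"

print(classify_frame(b"\x00\x01"))                         # AudioData
print(classify_frame('{"code": 4000, "message": "bad"}'))  # ErrorMessage
print(classify_frame('{"text": "hi", "isFinal": true}'))   # TranscriptionResult
```

If your schemas overlap, an explicit discriminator field (e.g. a `type` property on every JSON message) is more robust than shape-based inspection.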

Message Correlation

For maintaining context across streaming messages, consider documenting correlation identifiers:

```yaml
components:
  messages:
    TranscriptionResult:
      # ... existing message definition
      payload:
        type: object
        properties:
          # ... existing properties
          sessionId:
            type: string
            format: uuid
            description: Unique session identifier for the audio stream
            example: "123e4567-e89b-12d3-a456-426614174000"
          sequenceNumber:
            type: integer
            description: Sequence number for ordering messages within a session
            example: 42
```

These additional fields help developers track and correlate messages across the streaming session, which is particularly important for real-time applications like audio transcription.
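As a sketch of how a client might use these fields, a hypothetical helper can restore per-session ordering from `sessionId` and `sequenceNumber`, regardless of the order in which frames were processed:

```python
def order_results(messages: list) -> list:
    """Reassemble transcription messages per session in send order,
    using the sessionId / sequenceNumber fields described above."""
    by_session = {}
    for msg in messages:
        by_session.setdefault(msg["sessionId"], []).append(msg)
    # Within a session, sequenceNumber defines the authoritative order
    # even if the client handled the frames out of order.
    ordered = []
    for session_msgs in by_session.values():
        ordered.extend(sorted(session_msgs, key=lambda m: m["sequenceNumber"]))
    return ordered

msgs = [
    {"sessionId": "s1", "sequenceNumber": 2, "text": "world", "isFinal": True},
    {"sessionId": "s1", "sequenceNumber": 1, "text": "hello", "isFinal": False},
]
print([m["text"] for m in order_results(msgs)])  # ['hello', 'world']
```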


Error Codes and Connection Closure Documentation

Proper error handling documentation is crucial for WebSocket APIs, as it covers both protocol-level connection closures and application-level error messages. For your audio transcription service, you’ll need to document both standard WebSocket close codes and custom application error codes.

Standard WebSocket Close Codes

WebSocket has a set of standard close codes defined in RFC 6455. You should document these in your API specification:

| Code | Name | Description |
|------|------|-------------|
| 1000 | Normal Closure | The connection was closed normally. |
| 1001 | Going Away | The endpoint is going away, either because a server is being shut down or a browser is navigating away from the page. |
| 1002 | Protocol Error | An endpoint is terminating the connection due to a protocol error. |
| 1003 | Unsupported Data | An endpoint received a data type it doesn’t support. |
| 1005 | No Status Received | A close frame was received without a status code. |
| 1006 | Abnormal Closure | The connection was closed abnormally, without sending or receiving a close frame. |
| 1012 | Service Restart | The server is restarting. |

Custom Application Error Codes

Beyond standard WebSocket codes, your application should define custom error codes in the 4000-4999 range, which RFC 6455 reserves for private application use. For your transcription service, consider these custom codes:

| Code | Name | Description |
|------|------|-------------|
| 4000 | Invalid Authentication | Authentication failed or credentials are invalid. |
| 4001 | Invalid Session Configuration | The session configuration is invalid or missing required parameters. |
| 4002 | Invalid Model | Specified transcription model is invalid or not available. |
| 4003 | Unsupported Audio Format | The audio format is not supported. |
| 4004 | Audio Processing Error | Error occurred while processing the audio. |
| 4005 | Insufficient Quota | API quota exceeded for the current period. |
| 4029 | Rate Limited | Too many requests in a short period. |
| 4500 | Internal Server Error | An unexpected error occurred on the server. |
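A client typically turns these tables into a reconnection policy. The grouping below is an illustrative assumption (your service may classify codes differently):

```python
# Hypothetical client-side policy mapping the close codes above to a
# retry decision; the numeric values match the tables in this section.
RETRYABLE = {1001, 1006, 1012, 4029, 4500}   # transient conditions
FATAL = {4000, 4001, 4002, 4003, 4004, 4005}  # fix the request/config first

def should_reconnect(close_code: int) -> bool:
    """Return True if the client may retry the connection automatically."""
    if close_code == 1000:  # normal closure: nothing to retry
        return False
    if close_code in FATAL:
        return False
    return close_code in RETRYABLE

print(should_reconnect(4029), should_reconnect(4000))  # True False
```

For 4029 specifically, the `retryAfter` field in the error message format below the tables tells the client how long to back off.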

Error Message Format

For application-level errors (not just connection closure), define a structured error message format:

```yaml
components:
  messages:
    ErrorMessage:
      name: error_message
      title: Error Message
      summary: Structured error information
      contentType: application/json
      payload:
        type: object
        required:
          - code
          - message
        properties:
          code:
            type: integer
            description: Error code from the defined error code table
            example: 4000
          message:
            type: string
            description: Human-readable error description
            example: "Authentication failed. Please check your API key."
          details:
            type: object
            description: Additional error context
            properties:
              sessionId:
                type: string
                format: uuid
                description: Session ID if available
                example: "123e4567-e89b-12d3-a456-426614174000"
              retryAfter:
                type: integer
                description: Seconds to wait before retrying (for rate limiting)
                example: 60
          timestamp:
            type: string
            format: date-time
            description: When the error occurred
            example: "2023-01-01T12:34:56Z"
```

Connection Closure Documentation

In your AsyncAPI document, describe how connection closure should be handled. Note that AsyncAPI has no dedicated close operation, so the `close` key below is a documentation convention; graceful shutdown is modeled as an application-level control message:

```yaml
channels:
  /transcribe:
    # ... existing channel definition
    close:
      summary: Close the transcription session
      description: |
        Close the WebSocket connection gracefully. If `sendFinalTranscription` is true,
        the server will send any remaining transcription data before closing.
      message:
        $ref: '#/components/messages/CloseMessage'

components:
  messages:
    CloseMessage:
      name: close_message
      title: Close Message
      summary: Message to initiate connection closure
      contentType: application/json
      payload:
        type: object
        properties:
          sendFinalTranscription:
            type: boolean
            default: true
            description: Whether to send final transcription before closing
          reason:
            type: string
            description: Reason for closing the connection
            example: "User initiated disconnection"
```

This comprehensive error documentation ensures that both client and server developers understand the full error lifecycle, from connection issues to application-specific error handling.


Structuring Documentation: Connection vs In-Connection Operations

One of the most powerful aspects of AsyncAPI is its ability to clearly separate different aspects of WebSocket communication. For your audio transcription service, this means distinguishing between connection establishment (the handshake) and ongoing in-connection operations (sending audio, receiving transcriptions).

Three-Tier Documentation Structure

AsyncAPI provides a three-tier binding structure that perfectly addresses your concern about separating documentation:

  1. Channel Bindings - Document the connection establishment and handshake
  2. Operation Bindings - Document the in-connection operations
  3. Message Bindings - Document the payload format for each message

Channel Bindings for Connection Establishment

Channel bindings document how the connection itself is established. This is where you specify the WebSocket handshake process:

```yaml
channels:
  /transcribe:
    description: |
      Audio transcription WebSocket endpoint.
      The WebSocket connection is established through an HTTP upgrade request.
      The client must provide authentication headers in the initial request.
    servers: [production]
    bindings:
      ws:
        method: GET
        headers:
          Cache-Control: no-cache
          Upgrade: websocket
          Connection: Upgrade
        query:
          api-version: "1.0"
```

Operation Bindings for In-Connection Operations

Once the connection is established, you document the operations that can occur over that connection:

```yaml
channels:
  /transcribe:
    # ... existing channel definition
    publish:
      operationId: sendAudio
      summary: Send audio data for transcription
      description: |
        Send binary audio chunks for real-time transcription.
        Audio should be in PCM format with 16kHz sample rate.
      bindings:
        ws:
          method: binary
          encoding: binary
    subscribe:
      operationId: receiveTranscription
      summary: Receive transcription results
      description: |
        Receive text transcription results as they become available.
        Results may be partial until the final transcription is complete.
      bindings:
        ws:
          method: text
```

Message Bindings for Payload Schemas

Finally, message bindings define the exact structure of payloads:

```yaml
components:
  messages:
    AudioChunk:
      name: audio_chunk
      title: Audio Chunk
      summary: Binary audio data for transcription
      contentType: audio/octet-stream
      bindings:
        ws:
          type: request
          encoding: binary
      payload:
        type: string
        format: binary
    TranscriptionResult:
      name: transcription_result
      title: Transcription Result
      summary: Text transcription with metadata
      contentType: application/json
      bindings:
        ws:
          type: response
          encoding: text
      payload:
        type: object
        # ... schema properties as defined earlier
```

Complete Separation Example

Putting it all together, here’s how you would document the separation between connection and operations:

```yaml
channels:
  /transcribe:
    # Connection documentation
    bindings:
      ws:
        method: GET
        headers:
          Authorization: Bearer {token}
    description: Establish WebSocket connection for audio transcription

    # Operation documentation
    publish:
      bindings:
        ws:
          method: binary
      description: Send audio chunks for transcription
      message:
        $ref: '#/components/messages/AudioChunk'

    subscribe:
      bindings:
        ws:
          method: text
      description: Receive transcription results
      message:
        $ref: '#/components/messages/TranscriptionResult'

components:
  messages:
    AudioChunk:
      bindings:
        ws:
          type: request
      # ... payload definition

    TranscriptionResult:
      bindings:
        ws:
          type: response
      # ... payload definition
```

This clear separation makes it easy for developers to understand both how to establish the connection and what operations are available once connected. It also allows you to generate different types of documentation—connection guides and operation references—from the same AsyncAPI specification.


Complete AsyncAPI Example for Audio Transcription WebSocket

Here’s a complete AsyncAPI specification for your audio transcription WebSocket endpoint, incorporating all the best practices discussed:

```yaml
asyncapi: 2.6.0
info:
  title: Audio Transcription WebSocket API
  version: 1.0.0
  description: |
    WebSocket API for real-time audio transcription.

    ## Features
    - Real-time streaming of audio for transcription
    - Partial and final transcription results
    - Support for multiple audio formats
    - Error handling with structured messages

    ## Version History
    - 1.0.0: Initial release with basic transcription features
  contact:
    name: API Support
    url: https://api.example.com/support
    email: support@example.com

servers:
  production:
    url: wss://api.example.com/v1/ws/transcribe
    protocol: wss
    description: Production WebSocket server for audio transcription
    variables:
      region:
        description: AWS region
        default: us-east-1
    security:
      - ApiKey: []
      - OAuth2: [transcribe:read]

channels:
  /transcribe:
    description: |
      Audio transcription WebSocket endpoint.
      The WebSocket connection is established through an HTTP upgrade request.
      The client must provide authentication headers in the initial request.
    servers: [production]
    bindings:
      ws:
        method: GET
        headers:
          Cache-Control: no-cache
          Upgrade: websocket
          Connection: Upgrade
        query:
          api-version: "1.0"

    publish:
      operationId: sendAudio
      summary: Send audio data for transcription
      description: |
        Send binary audio chunks for real-time transcription.
        Audio should be in PCM format with 16kHz sample rate, 16-bit depth, mono channel.
      bindings:
        ws:
          method: binary
          encoding: binary
      message:
        $ref: '#/components/messages/AudioData'

    subscribe:
      operationId: receiveTranscription
      summary: Receive transcription results
      description: |
        Receive text transcription results as they become available.
        Results may be partial until the final transcription is complete.
      bindings:
        ws:
          method: text
      message:
        $ref: '#/components/messages/TranscriptionResult'

    close:
      summary: Close the transcription session
      description: |
        Close the WebSocket connection gracefully. If `sendFinalTranscription` is true,
        the server will send any remaining transcription data before closing.
      message:
        $ref: '#/components/messages/CloseMessage'

components:
  messages:
    AudioData:
      name: audio_data
      title: Audio Data
      summary: Binary audio chunk for transcription
      contentType: audio/octet-stream
      bindings:
        ws:
          type: request
          encoding: binary
      payload:
        type: string
        format: binary
        description: |
          Audio chunk in PCM format:
          - Sample rate: 16kHz
          - Bit depth: 16-bit
          - Channels: 1 (mono)
          - Chunk duration: 100ms recommended

    TranscriptionResult:
      name: transcription_result
      title: Transcription Result
      summary: Transcription text with metadata
      contentType: application/json
      bindings:
        ws:
          type: response
          encoding: text
      payload:
        type: object
        properties:
          text:
            type: string
            description: Transcribed text
            example: "Hello, this is a test of the transcription service."
          startTime:
            type: number
            format: float
            description: Start time in seconds from audio beginning
            example: 1.23
          endTime:
            type: number
            format: float
            description: End time in seconds from audio beginning
            example: 2.45
          isFinal:
            type: boolean
            description: Whether this transcription is final or partial
            example: false
          sessionId:
            type: string
            format: uuid
            description: Unique session identifier for the audio stream
            example: "123e4567-e89b-12d3-a456-426614174000"
          sequenceNumber:
            type: integer
            description: Sequence number for ordering messages within a session
            example: 42
          alternatives:
            type: array
            items:
              type: object
              properties:
                text:
                  type: string
                  description: Alternative transcription
                  example: "Hello, this is a test of the speech recognition."
                confidence:
                  type: number
                  format: float
                  description: Confidence score between 0 and 1
                  example: 0.95

    CloseMessage:
      name: close_message
      title: Close Message
      summary: Message to initiate connection closure
      contentType: application/json
      bindings:
        ws:
          type: control
      payload:
        type: object
        properties:
          sendFinalTranscription:
            type: boolean
            default: true
            description: Whether to send final transcription before closing
          reason:
            type: string
            description: Reason for closing the connection
            example: "User initiated disconnection"

    ErrorMessage:
      name: error_message
      title: Error Message
      summary: Structured error information
      contentType: application/json
      bindings:
        ws:
          type: error
      payload:
        type: object
        required:
          - code
          - message
        properties:
          code:
            type: integer
            description: Error code from the defined error code table
            example: 4000
          message:
            type: string
            description: Human-readable error description
            example: "Authentication failed. Please check your API key."
          details:
            type: object
            description: Additional error context
            properties:
              sessionId:
                type: string
                format: uuid
                description: Session ID if available
                example: "123e4567-e89b-12d3-a456-426614174000"
              retryAfter:
                type: integer
                description: Seconds to wait before retrying (for rate limiting)
                example: 60
          timestamp:
            type: string
            format: date-time
            description: When the error occurred
            example: "2023-01-01T12:34:56Z"

  securitySchemes:
    ApiKey:
      type: httpApiKey
      in: header
      name: X-API-Key
    OAuth2:
      type: oauth2
      flows:
        implicit:
          authorizationUrl: https://api.example.com/oauth/authorize
          scopes:
            transcribe:read: Access to transcription service
```

This comprehensive AsyncAPI specification documents:

  • Connection establishment process with authentication
  • Binary audio stream format and requirements
  • Structured transcription results with metadata
  • Error handling with custom error codes
  • Graceful connection closure
  • Version information and security requirements

You can use this specification with AsyncAPI tools to generate interactive documentation, client SDKs, and server stubs, ensuring consistency across your implementation.


Path Naming Conventions for WebSocket APIs

Choosing appropriate path naming conventions for WebSocket endpoints is an important aspect of API design that affects both discoverability and consistency across your API surface. Based on community consensus and industry practices, here are the recommended approaches.

WebSocket Path Prefix

The most widely adopted convention is to use a /ws/ prefix for WebSocket endpoints to distinguish them from REST endpoints:

HTTP REST API: https://api.example.com/v1/users
WebSocket API: wss://api.example.com/v1/ws/transcribe

This clear separation makes it easy for developers to identify which endpoints are WebSocket connections without examining the protocol or documentation. The /ws/ prefix is intuitive and consistent across many organizations.

Semantic Naming

Beyond the prefix, use semantic naming that clearly describes the functionality:

wss://api.example.com/v1/ws/audio-transcribe
wss://api.example.com/v1/ws/real-time-chat
wss://api.example.com/v1/ws/notification-push

These names clearly indicate both the protocol type (WebSocket) and the specific functionality.

Versioning in the Path

As discussed earlier, include the API version in the path:

wss://api.example.com/v1/ws/transcribe
wss://api.example.com/v2/ws/transcribe

Place the version after the domain but before the semantic endpoint name for consistency with your REST API versioning strategy.

Consistent Base Paths

Maintain consistent base paths across all your WebSocket endpoints:

wss://api.example.com/v1/ws/audio-transcribe
wss://api.example.com/v1/ws/speech-recognition
wss://api.example.com/v1/ws/text-to-speech

This consistency makes it easier for developers to understand your API structure and navigate between different endpoints.

Alternative Naming Patterns

While /ws/ is the most common prefix, some organizations use different conventions:

| Pattern | Example | When to Use |
|---------|---------|-------------|
| `/api/ws/` | `wss://api.example.com/api/v1/ws/transcribe` | When your base path already includes `/api/` |
| `/realtime/` | `wss://api.example.com/v1/realtime/transcribe` | When emphasizing the real-time nature |
| `/stream/` | `wss://api.example.com/v1/stream/transcribe` | When emphasizing the streaming nature |

Choose a pattern that aligns with your existing API naming conventions and clearly communicates the WebSocket nature of the endpoint.

Full Example

Here’s how your complete WebSocket endpoint URL would look with recommended naming conventions:

wss://api.example.com/v1/ws/audio-transcribe

Breaking this down:

  • wss:// - Secure WebSocket protocol
  • api.example.com - Your domain
  • /v1/ - API version
  • /ws/ - WebSocket endpoint prefix
  • audio-transcribe - Semantic functionality description

This structure provides a clear, consistent, and discoverable URL pattern for your WebSocket API.
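The convention can be captured in a small, hypothetical helper that assembles endpoint URLs from their parts:

```python
def ws_endpoint(domain: str, version: str, name: str,
                prefix: str = "ws", secure: bool = True) -> str:
    """Assemble a WebSocket endpoint URL following the
    scheme://domain/version/prefix/name convention described above."""
    scheme = "wss" if secure else "ws"
    return f"{scheme}://{domain}/{version}/{prefix}/{name}"

print(ws_endpoint("api.example.com", "v1", "audio-transcribe"))
# wss://api.example.com/v1/ws/audio-transcribe
```

Centralizing URL construction like this keeps client code and documentation from drifting apart as new versions and endpoints are added.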


Sources

  1. AsyncAPI Blog Part 1 — Why OpenAPI specification won’t help with WebSocket APIs: https://asyncapi.com/blog/websocket-api-documentation-openapi-asyncapi-pt1/
  2. AsyncAPI Blog Part 2 — WebSocket API documentation best practices: https://asyncapi.com/blog/websocket-api-documentation-openapi-asyncapi-pt2/
  3. AsyncAPI WebSocket Bindings — Three-tier binding structure for channels, operations, and messages: https://www.asyncapi.com/docs/reference/specification/v3.0.0#channelBindings
  4. AWS PDK Documentation — TypeSpec and Smithy as recommended model languages for WebSocket APIs: https://docs.aws.amazon.com/apigateway/latest/developerguide/http-api-develop-model.html
  5. Prosa AI STT Documentation — Real-world error codes example for transcription services: https://docs.prosa.ai/errors-and-status-codes/
  6. Swagger/OpenAPI Docs — WebSocket scheme support in OpenAPI 3.0: https://swagger.io/docs/specification/v3.0/serialization/
  7. Stack Overflow Naming Convention - /api/path for HTTP, /ws/path for WebSocket convention: https://stackoverflow.com/questions/42782766/whats-the-correct-uri-for-a-websocket-endpoint
  8. AsyncAPI 3.0.0 Release Notes — Modern specification features including channel as TCP connection: https://www.asyncapi.com/blog/asyncapi-3-0-0-is-out/
  9. Bump.sh Comparison — AsyncAPI vs OpenAPI for different API paradigms: https://bump.sh/blog/openapi-vs-asyncapi

Conclusion

Documenting WebSocket APIs effectively requires a different approach than traditional REST APIs, with AsyncAPI emerging as the definitive standard for this purpose. Based on best practices and industry consensus, the five key recommendations for your audio transcription WebSocket are:

  1. Use AsyncAPI, not OpenAPI - AsyncAPI provides the necessary structure to document duplex, message-based communication that OpenAPI cannot adequately model.

  2. Implement URI versioning - Follow the pattern wss://api.example.com/v1/ws/transcribe to clearly indicate API versions and enable proper routing.

  3. Document streaming data flows separately - Use AsyncAPI’s binary format for audio input and structured JSON for text output, with clear documentation of format requirements.

  4. Define comprehensive error handling - Document both standard WebSocket close codes and custom application error codes in the 4000-4999 range, along with structured error message formats.

  5. Separate connection from operations - Utilize AsyncAPI’s three-tier binding structure to clearly distinguish between connection establishment (channel bindings) and in-connection operations (operation bindings).

By following these practices, you’ll create clear, comprehensive documentation that enables developers to effectively implement and integrate with your WebSocket API. The AsyncAPI specification not only serves as documentation but can also be used to generate client SDKs, server stubs, and interactive documentation, ensuring consistency across your implementation.
