Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition with active dictionary when using async/await (w/ proposed fix) #190

Open
slessans opened this issue Jan 3, 2024 · 3 comments

Comments

@slessans
Copy link

slessans commented Jan 3, 2024

Describe the bug
SwiftGraphqlClient maintains an active state dictionary, which is concurrently accessed by multiple threads if used via async/await.

Since each async function in swift is dispatched to an executor that the caller cannot control, there is no real way to impose synchronization from outside of the class.

This race condition often manifests as *** Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '-[NSIndexPath count]: unrecognized selector sent to instance 0x8000000000000000'

To Reproduce
Steps to reproduce the behavior:

Run concurrent requests on the object such as:

async let q1 = client.query(...)
async let q2 = client.query(...)
async let q3 = client.query(...)
try await (a, b, c) = (q1, q2, q3)

Expected behavior
The client itself should either be thread-safe or have a way to enforce the concurrency constraints? For instance, I believe, making it an actor may solve this.

In a real-world, but relatively simple app that I built, I couldn't go more than a few minutes without running into this. I ultimately hacked it by creating my own internal API client class and doing this:

class ApiClient {
    private let _client: SwiftGraphQLClient.Client
    // any serial queue will work here
    private let _clientSyncQueue = DispatchQueue(label: "...", qos: .userInitiated, attributes: [])

   private func query<T, TypeLock>(
        _ selection: Selection<T, TypeLock>,
        as operationName: String? = nil,
        request: URLRequest? = nil,
        policy: SwiftGraphQLClient.Operation.Policy = .cacheFirst
    ) async throws -> DecodedOperationResult<T> where TypeLock: GraphQLHttpOperation {
        return try await withCheckedThrowingContinuation { continuation in
            self._clientSyncQueue.async { [weak self] in
                guard let self = self else { return }
                let publisher = self._client.executeQuery(
                    for: selection, as: operationName, url: request, policy: policy
                )
                .tryMap { result in
                    // NOTE: If there was an error during the execution, we want to raise it before running
                    //       the decoder on the `data` which will most likely fail.
                    if let error = result.error {
                        throw error
                    }
                    return try result.decode(selection: selection)
                }
                .eraseToAnyPublisher()

                var cancellable: AnyCancellable?
                cancellable = publisher.first()
                    .sink { result in
                        switch result {
                        case .finished:
                            break
                        case let .failure(error):
                            continuation.resume(throwing: error)
                        }
                        cancellable?.cancel()
                    } receiveValue: { value in
                        continuation.resume(with: .success(value))
                    }
            }
        }
    }
}

There may be a simpler way, but this was the best way I could ensure that executeQuery (which is the top of the synchronous call stack that leads to the dict access) is never accessed concurrently. I have confirmed the fix works using instrumentation as well as testing.

Notes

If there is something obviously wrong I am doing please let me know.

I think without this, it is very hard to use from async/await swift. I am happy to contribute a fix for this by adding a synchronous queue or a mutex into the client itself. My guess would be mutex is fine since it is such a granular access. Let me know if you agree with the direction and I can create a PR (it would be much cleaner than the hack above)

@slessans slessans changed the title Concurrency Issues with active dictionary Race condition with active dictionary when using async/await (w/ proposed fix) Jan 3, 2024
@slessans
Copy link
Author

slessans commented Jan 3, 2024

Looks like this is related to #98

@nordfogel and @maticzav

@shaps80
Copy link
Collaborator

shaps80 commented Jan 4, 2024

@slessans you're right and tbh its frustrating I've also found that we don't yet have thread safety. I have actually rewritten (as a part of v6) a new foundational layer that brings a custom executor and actor to the library.

This along with a new stack brings modern concurrency to the low level APIs as well as ensures active synchronisation which as you mentioned above, isn't really possible using the current approach.

I can't speak (yet) to the availability of this but I wanted to let you know this IS being worked on as we speak and I'm making great headway with this atm.

I'll update here once I have more information, thanks for your issue and your patience.

@pokryfka
Copy link
Contributor

pokryfka commented Jun 20, 2024

@shaps80 can you make the branch with v6 public?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants