Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf(http) Improve IsBinary checking performance #539

Merged
merged 2 commits into from
Dec 14, 2022

Conversation

gyx1000
Copy link
Contributor

@gyx1000 gyx1000 commented Dec 14, 2022

Hello,

While browsing the remix project, I noticed that the control of the "Content-Type" header which allows to know if the body is binary could be improved. It is also possible to apply this change in this project.

Indeed, the check on the value of the header is done using an iteration on the array and a test for the presence of string with the Array.includes method.
This implies a complexity of O(TypesCount*IncludesComplexity). It would be more efficient to use Set.has because the complexity would be O(1).

Related remix PR : remix-run/remix#4761

@CLAassistant
Copy link

CLAassistant commented Dec 14, 2022

CLA assistant check
All committers have signed the CLA.

@ryanblock
Copy link
Member

// ./bench.js
let binaryTypes = require('./src/http/helpers/binary-types.js')
let header = 'hi'

console.time('some')
for (let i = 0; i < 1000000; i++) {
  let isBinary = binaryTypes.some(t => header.includes(t))
}
console.timeEnd('some')

console.time('has')
for (let i = 0; i < 1000000; i++) {
  let binaryTypesSet = new Set(binaryTypes)
  let isBinary = binaryTypesSet.has(header)
}
console.timeEnd('has')

// some: 288.125ms
// has: 1.645s

@gyx1000
Copy link
Contributor Author

gyx1000 commented Dec 14, 2022

@ryanblock
You generate the set on each iteration.
Could you try with the set defined outside the for loop ?

@gyx1000
Copy link
Contributor Author

gyx1000 commented Dec 14, 2022

@ryanblock
With the Set defined outside I have this benchmark

some: 150.847ms
has: 6.097ms

@ryanblock
Copy link
Member

AWS Lambda provides a stateless execution model; requests may sometimes get an invocation from a warm Lambda, but when architecting Lambda-based apps a fresh state should generally be assumed upon each invocation. As such, the above code example is the truest representation of the Lambda invocation model.

That said, it is also true that the penalty here is in the conversion of the Array to the Set. This MIME type list is only used in one place, so we can avoid the penalty of that conversion by simply wrapping the src/http/helpers/binary-types-set.js export in a new Set([...; on my machine this requires at nearly the same speed (array: 0.662ms, set: 0.695ms). That seems to check all the efficiency boxes as far as I can tell!

@ryanblock
Copy link
Member

Quick update: try the Array with binaryTypes.includes(header) instead of binaryTypesSet.has(header); this is, for me, even faster:

console.time('some')
for (let i = 0; i < 1000000; i++) {
  let isBinary = binaryTypes.some(t => header.includes(t))
}
console.timeEnd('some')

console.time('includes')
for (let i = 0; i < 1000000; i++) {
  let isBinary = binaryTypes.includes(header)
}
console.timeEnd('includes')

console.time('has')
for (let i = 0; i < 1000000; i++) {
  let isBinary = binaryTypesSet.has(header)
}
console.timeEnd('has')

// some: 282.091ms
// includes: 6.391ms
// has: 8.401ms

@gyx1000
Copy link
Contributor Author

gyx1000 commented Dec 14, 2022

Great, include is faster for me too.
The difference with some is certainly the memory allocation for the arrow function.

I agree with the fact that the change will not bring much with cold lambda function.

@ryanblock ryanblock merged commit 1c93d4f into architect:main Dec 14, 2022
@ryanblock
Copy link
Member

Change published in v5.3.3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants