Refactor of the JavaScript pipeline #35

victorb · 2018-09-22T13:02:43Z

This commit changes the old scriptable pipeline into a declarative
pipeline.

Benefits are:

Everything split into stages that can be restarted if needed without
restarting the entire pipeline
Future possibility to cache stashes between builds (not having to run
npm install on every build if previously successful)
Faster reporting of node | browser | webworker CI status since we
run platforms sequentially with this commit
Better support and stability from Jenkins (development efforts are
focused on declerative pipelines)

Drawbacks:

If the entire build itself is fully successful, it'll take longer than before
as stashing/unstashing adds some build time.

Missing changes:

Confirmation from JS-team on the order of tests
- We might want to run coverage before platform tests etc etc
- Flaky tests should end up last, so we retry as little of the
  pipeline as possible

Example pipeline:

https://ci.ipfs.team/blue/organizations/jenkins/Multiformats%2Fjs-multiaddr/detail/master/74/pipeline

License: MIT
Signed-off-by: Victor Bjelkholm git@victor.earth

victorb · 2018-09-22T13:06:43Z

@ipfs/javascript-team please let me know what you think of this. It'll make pipelines that succeeds without any failures to be a bit slower but since we can restart each stage if it fails, it gives us a huge benefit for js-ipfs and other flaky projects.

achingbrain · 2018-09-22T13:18:12Z

This looks great, being able to restart individual steps without having to retrigger the whole build will be a massive win.

We might want to run coverage before platform tests etc etc

Running coverage also runs the tests though, so maybe we'd just run coverage as part of the Linux build?

We might have OS-specific code branches though, so a neat improvement (for a later PR) would be to run coverage on each platform, save the generated file as a build artifact of that step then have a post-test step that collates all the platform coverage reports together and publishes stats based on that.

victorb · 2018-09-22T13:22:10Z

Thanks for taking a look @achingbrain!

Running coverage also runs the tests though, so maybe we'd just run coverage as part of the Linux build?

I initially put it together with the "Checks" stage so it runs first. My thought behind putting it in the back is specifically regarding js-ipfs where test:node is already super slow, and together with instrumenting, it gets even slower and blocks progress for the other stages. (Currently investigating if we can proceed with other stages while parallel steps within another stage is running)

Maybe we could run coverage instead of test:node (they should be running the same tests, at least in theory) for each platform (and each nodejs version, not sure we have different code branches for different nodejs versions though?), then "Coverage" step in the end would just concat those reports together and upload them.

achingbrain · 2018-09-22T13:26:09Z

Maybe we could run coverage instead of test:node (they should be running the same tests, at least in theory) for each platform (and each nodejs version, not sure we have different code branches for different nodejs versions though?), then "Coverage" step in the end would just concat those reports together and upload them.

I think that's what I was suggesting as the 'improvement'. Not sure if you'd be able to just concat the reports together, might need to parse the files instead and do something a clever. Needs a bit of experimenting, though I doubt we're the first people to have this problem so there might be something out there already we can use.

If you ran coverage instead of test:node it would miss any browser-specific codepaths. Not sure how many of those we have, though it's not inconceivable.

victorb · 2018-09-22T13:29:19Z

If you ran coverage instead of test:node it would miss any browser-specific codepaths. Not sure how many of those we have, though it's not inconceivable.

Hm? browser-specific codepaths are exercised in separate steps as we run both test:browser and test:webworker if they exists, for each platform. I'm just mentioning coverage to replace test:node specifically, which should only hit node-specific code-paths (again, in theory).

Needs a bit of experimenting

Yeah, probably. I think up-until now we have only code-coverage on Linux, so at least this change is not making the situation worse. But agree that we can make it better.

achingbrain · 2018-09-22T13:34:08Z

Hm?

I just mean that if you only instrument the code for running under node, the generated coverage report would not contain any browser-specific code paths since they aren't executed but might be during the test:browser/test:webworker steps run later so it would not show you the whole picture.

Anyway, this is great, we should get it in and worry about improving the coverage stuff later.

daviddias · 2018-09-22T17:22:02Z

This commit changes the old scriptable pipeline into a declarative pipeline. Benefits are: - Everything split into stages that can be restarted if needed without restarting the entire pipeline - Future possibility to cache stashes between builds (not having to run `npm install` on every build if previously successful) - Faster reporting of `node | browser | webworker` CI status since we run platforms sequentially with this commit - Better support and stability from Jenkins (development efforts are focused on declerative pipelines) Drawbacks: - If the entire build itself is fully successful, it'll take longer than before as stashing/unstashing adds some build time. Missing changes: - Confirmation from JS-team on the order of tests - We might want to run `coverage` before platform tests etc etc - Flaky tests should end up last, so we retry as little of the pipeline as possible License: MIT Signed-off-by: Victor Bjelkholm <git@victor.earth>

victorb · 2018-09-23T08:33:04Z

@diasdavid everything is not run in parallel because of 1) ipfs-inactive/dev-team-enablement#113, I could try it again with the new declarative pipeline that this PR introduces, but chances are it has the same effect 2) if everything is run in parallel, means that when restarting the stage, it'll rerun all the tests for all the platforms when chances are it only failed in one

hugomrdias · 2018-09-24T09:53:52Z

Thanks you Victor this looks great!!

test:browser includes test:webworker or would be another job?
coverage job also runs tests or just gathers the reports and uploads ?

some notes on cov:
this may already be how the pipeline works , if so just ignore the following :)

regarding cov i think we could instrument with nyc only on the linux pipeline (including node and browser) and make the coverage job dependent on the linux pipeline.
linux pipeline would generate reports make then artifacts or whatever jenkins calls this :) and the cov job would only gather the artifacts and upload then to codecov. codecov will take care of the merging https://docs.codecov.io/docs/merging-reports

alanshaw · 2018-09-24T13:38:37Z

Hey @victorbjelkholm, thanks so much for pushing this forward. This addresses, in part, most of the issues I rasied in ipfs-inactive/dev-team-enablement#73. I understand that fully parallel isn't possible due to a bug in Jenkins. My only concern is that this'll actually take longer for a single run to complete, but I think the benefit of being able to restart just one section of the pipeline outweighs this right now.

P.S. I'm very happy the browser tests are now not being done for both node versions!

victorb · 2018-09-24T13:52:48Z

test:browser includes test:webworker or would be another job?

It does not. The pipeline automatically checks for the test:node, test:browser and test:webworker scripts, and only runs them if they exist. So projects that wants to run webworker tests needs to have a test:webworker script in their package.json.

coverage job also runs tests or just gathers the reports and uploads ?

Yeah, it runs through the tests while instrumenting and records which code gets run, then uploads the report.

My only concern is that this'll actually take longer for a single run to complete

Yup, that's a tradeoff that we would make in this case, against the ability of rerunning just certain steps. I think currently the bottle-neck is passing the stashed dependencies around the jobs. I tried adding a cache (basically zip the files and upload to jenkins master, and when running tests, download and unzip) but it added more time to the pipeline than just running npm install.

If we spin up a new npm-on-ipfs registry within the Jenkins network, we effectively have a npm cache that would greatly speed up the pipeline. ipfs-inactive/dev-team-enablement#51

hugomrdias · 2018-09-24T14:01:26Z

if the cov runs the tests wouldn't be better to setup something like i described in the previous comment?

regarding cov i think we could instrument with nyc only on the linux pipeline (including node and browser) and make the coverage job dependent on the linux pipeline.
linux pipeline would generate reports make them artifacts or whatever jenkins calls this :) and the cov job would only gather the artifacts and upload then to codecov. Codecov will take care of the merging docs.codecov.io/docs/merging-reports

victorb · 2018-09-24T14:09:19Z

@hugomrdias yeah, that makes sense. npm run coverage currently just runs on linux, we could easily run it on windows + macos as well. Bit harder to make it work with browsers I think, that work would have to go into aegir itself, and that would trickier than just cross-platform.

One thing that bothers me with this refactored pipeline is that even though dependencies for linux has been fully downloaded, it doesn't continue until all platforms have done the same. Gonna try to find a solution for this before merging this.

hugomrdias · 2018-09-24T14:22:03Z

my point is aegir coverage should only be gather/upload files generated by aegir test --coverage. We would only run tests once.

adding nyc to karma should be fairly easy
codecov handles merging of multiple reports which is perfect
for now running aegir test --coverage on linux should be enough, we can do a simple benchmark to see how much nyc delays the test run if its negligible just run all tests with coverage.

I can help on aegir side to make this happen!!

License: MIT Signed-off-by: Victor Bjelkholm <git@victor.earth>

victorb · 2018-09-24T15:56:26Z

@hugomrdias yeah, sounds like a good idea, collect coverage when we do the runs, then after we can just upload all the reports. Will add this is a feature request after finishing this pipeline.

Regarding pipeline having to wait for all platforms before continuing, we have two options:

Either we make the tests to continue directly after installing the dependencies, but we can't run the webworker + browser + node tests in parallel then, they will have to run sequentially
Or, we have it the way we have, and all platforms can start their test at the same time, but the pipeline will wait for all npm install to finish. As mentioned above, if we put npm-on-ipfs within the Jenkins infra, npm install should become very fast so I think this is the way to go for now.

victorb · 2018-09-24T17:33:37Z

In the end, pipeline now looks like this:

Since "checks" moved into their own stage and browser+webworker tests now doesn't run per nodejs version but only per OS, it seems it doesn't have too much impact on the speed of each test.

victorb · 2018-09-24T17:36:06Z

Gonna run this pipeline once on every JS project to make sure it works everywhere, then I'll merge it and it'll be used by default.

If you have any objections, now is your chance!

License: MIT Signed-off-by: Victor Bjelkholm <git@victor.earth>

hugomrdias · 2018-09-24T20:22:04Z

Can we still restart jobs inside the tests pipeline? Or we need to re run everything if windows fails? Just an example nothing against windows ^^

victorb · 2018-09-24T20:26:22Z

Unfortunately we cannot restart individual parallel steps (and we haven’t been able to before either), so if windows browser tests fails, all the tests would have to be rerun. The alternative is to split up the oses in different stages (but the browser, web worker, node tests would still be parallel) but then they’ll run sequentially instead of parallel. There is a issue in the Jenkins issue tracker about that, so as fast as that functionality exists, we’ll deploy it :)

…

On Mon, 24 Sep 2018 at 22:22, Hugo Dias ***@***.***> wrote: Can we still restart jobs inside the tests pipeline? Or we need to re run everything if windows fails? Just an example nothing against windows ^^ — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#35 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAcD9OOsSaEm3VZdGJDt19pfz34bxnMlks5ueT7tgaJpZM4W1Qtv> .

victorb · 2018-09-25T08:37:50Z

Thanks everyone for your feedback!

victorb force-pushed the js-pipeline-refactor branch from eeb14f3 to 50cd27a Compare September 22, 2018 13:03

victorb requested a review from alanshaw September 22, 2018 13:07

victorb force-pushed the js-pipeline-refactor branch 2 times, most recently from 9c29b0d to b9df0fb Compare September 22, 2018 16:37

victorb force-pushed the js-pipeline-refactor branch from b9df0fb to 3fa8e10 Compare September 22, 2018 18:05

Less stages and stash/unstash

c8989ac

License: MIT Signed-off-by: Victor Bjelkholm <git@victor.earth>

Only run lint and coverage if those scripts exists

788413e

License: MIT Signed-off-by: Victor Bjelkholm <git@victor.earth>

victorb merged commit dc4af3a into master Sep 25, 2018

victorb deleted the js-pipeline-refactor branch September 25, 2018 08:37

This was referenced Sep 25, 2018

CI pipeline for js-ipfs and js-ipfs-api ipfs-inactive/dev-team-enablement#73

Closed

Parallize aegir scripts ipfs-inactive/dev-team-enablement#102

Closed

achingbrain mentioned this pull request Oct 2, 2018

Should we have steps per OS on Jenkins ipfs-inactive/dev-team-enablement#146

Closed

Refactor of the JavaScript pipeline #35

Refactor of the JavaScript pipeline #35

Uh oh!

Conversation

victorb commented Sep 22, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

victorb commented Sep 22, 2018

Uh oh!

achingbrain commented Sep 22, 2018

Uh oh!

victorb commented Sep 22, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

achingbrain commented Sep 22, 2018

Uh oh!

victorb commented Sep 22, 2018

Uh oh!

achingbrain commented Sep 22, 2018

Uh oh!

daviddias commented Sep 22, 2018

Uh oh!

victorb commented Sep 23, 2018

Uh oh!

hugomrdias commented Sep 24, 2018

Uh oh!

alanshaw commented Sep 24, 2018

Uh oh!

victorb commented Sep 24, 2018

Uh oh!

hugomrdias commented Sep 24, 2018

Uh oh!

victorb commented Sep 24, 2018

Uh oh!

hugomrdias commented Sep 24, 2018

Uh oh!

victorb commented Sep 24, 2018

Uh oh!

victorb commented Sep 24, 2018

Uh oh!

victorb commented Sep 24, 2018

Uh oh!

hugomrdias commented Sep 24, 2018

Uh oh!

victorb commented Sep 24, 2018 via email

Uh oh!

victorb commented Sep 25, 2018

Uh oh!

Uh oh!

victorb commented Sep 22, 2018 •

edited

Loading

victorb commented Sep 22, 2018 •

edited

Loading