Skip to content
This repository was archived by the owner on Apr 29, 2020. It is now read-only.

Refactor of the JavaScript pipeline #35

Merged
merged 3 commits into from
Sep 25, 2018
Merged

Refactor of the JavaScript pipeline #35

merged 3 commits into from
Sep 25, 2018

Conversation

victorb
Copy link
Contributor

@victorb victorb commented Sep 22, 2018

This commit changes the old scriptable pipeline into a declarative
pipeline.

Benefits are:

  • Everything split into stages that can be restarted if needed without
    restarting the entire pipeline
  • Future possibility to cache stashes between builds (not having to run
    npm install on every build if previously successful)
  • Faster reporting of node | browser | webworker CI status since we
    run platforms sequentially with this commit
  • Better support and stability from Jenkins (development efforts are
    focused on declerative pipelines)

Drawbacks:

  • If the entire build itself is fully successful, it'll take longer than before
    as stashing/unstashing adds some build time.

Missing changes:

  • Confirmation from JS-team on the order of tests
    • We might want to run coverage before platform tests etc etc
    • Flaky tests should end up last, so we retry as little of the
      pipeline as possible

Example pipeline:
image
https://ci.ipfs.team/blue/organizations/jenkins/Multiformats%2Fjs-multiaddr/detail/master/74/pipeline

License: MIT
Signed-off-by: Victor Bjelkholm git@victor.earth

@victorb victorb force-pushed the js-pipeline-refactor branch from eeb14f3 to 50cd27a Compare September 22, 2018 13:03
@victorb
Copy link
Contributor Author

victorb commented Sep 22, 2018

@ipfs/javascript-team please let me know what you think of this. It'll make pipelines that succeeds without any failures to be a bit slower but since we can restart each stage if it fails, it gives us a huge benefit for js-ipfs and other flaky projects.

@victorb victorb requested a review from alanshaw September 22, 2018 13:07
@achingbrain
Copy link
Contributor

This looks great, being able to restart individual steps without having to retrigger the whole build will be a massive win.

We might want to run coverage before platform tests etc etc

Running coverage also runs the tests though, so maybe we'd just run coverage as part of the Linux build?

We might have OS-specific code branches though, so a neat improvement (for a later PR) would be to run coverage on each platform, save the generated file as a build artifact of that step then have a post-test step that collates all the platform coverage reports together and publishes stats based on that.

@victorb
Copy link
Contributor Author

victorb commented Sep 22, 2018

Thanks for taking a look @achingbrain!

Running coverage also runs the tests though, so maybe we'd just run coverage as part of the Linux build?

I initially put it together with the "Checks" stage so it runs first. My thought behind putting it in the back is specifically regarding js-ipfs where test:node is already super slow, and together with instrumenting, it gets even slower and blocks progress for the other stages. (Currently investigating if we can proceed with other stages while parallel steps within another stage is running)

Maybe we could run coverage instead of test:node (they should be running the same tests, at least in theory) for each platform (and each nodejs version, not sure we have different code branches for different nodejs versions though?), then "Coverage" step in the end would just concat those reports together and upload them.

@achingbrain
Copy link
Contributor

Maybe we could run coverage instead of test:node (they should be running the same tests, at least in theory) for each platform (and each nodejs version, not sure we have different code branches for different nodejs versions though?), then "Coverage" step in the end would just concat those reports together and upload them.

I think that's what I was suggesting as the 'improvement'. Not sure if you'd be able to just concat the reports together, might need to parse the files instead and do something a clever. Needs a bit of experimenting, though I doubt we're the first people to have this problem so there might be something out there already we can use.

If you ran coverage instead of test:node it would miss any browser-specific codepaths. Not sure how many of those we have, though it's not inconceivable.

@victorb
Copy link
Contributor Author

victorb commented Sep 22, 2018

If you ran coverage instead of test:node it would miss any browser-specific codepaths. Not sure how many of those we have, though it's not inconceivable.

Hm? browser-specific codepaths are exercised in separate steps as we run both test:browser and test:webworker if they exists, for each platform. I'm just mentioning coverage to replace test:node specifically, which should only hit node-specific code-paths (again, in theory).

Needs a bit of experimenting

Yeah, probably. I think up-until now we have only code-coverage on Linux, so at least this change is not making the situation worse. But agree that we can make it better.

@achingbrain
Copy link
Contributor

Hm?

I just mean that if you only instrument the code for running under node, the generated coverage report would not contain any browser-specific code paths since they aren't executed but might be during the test:browser/test:webworker steps run later so it would not show you the whole picture.

Anyway, this is great, we should get it in and worry about improving the coverage stuff later.

@victorb victorb force-pushed the js-pipeline-refactor branch 2 times, most recently from 9c29b0d to b9df0fb Compare September 22, 2018 16:37
@daviddias
Copy link

image

This commit changes the old scriptable pipeline into a declarative
pipeline.

Benefits are:

  - Everything split into stages that can be restarted if needed without
    restarting the entire pipeline
  - Future possibility to cache stashes between builds (not having to run
    `npm install` on every build if previously successful)
  - Faster reporting of `node | browser | webworker` CI status since we
    run platforms sequentially with this commit
  - Better support and stability from Jenkins (development efforts are
    focused on declerative pipelines)

Drawbacks:

  - If the entire build itself is fully successful, it'll take longer than before
    as stashing/unstashing adds some build time.

Missing changes:

  - Confirmation from JS-team on the order of tests
    - We might want to run `coverage` before platform tests etc etc
    - Flaky tests should end up last, so we retry as little of the
      pipeline as possible

License: MIT
Signed-off-by: Victor Bjelkholm <git@victor.earth>
@victorb victorb force-pushed the js-pipeline-refactor branch from b9df0fb to 3fa8e10 Compare September 22, 2018 18:05
@victorb
Copy link
Contributor Author

victorb commented Sep 23, 2018

@diasdavid everything is not run in parallel because of 1) ipfs-inactive/dev-team-enablement#113, I could try it again with the new declarative pipeline that this PR introduces, but chances are it has the same effect 2) if everything is run in parallel, means that when restarting the stage, it'll rerun all the tests for all the platforms when chances are it only failed in one

@hugomrdias
Copy link

Thanks you Victor this looks great!!

  • test:browser includes test:webworker or would be another job?
  • coverage job also runs tests or just gathers the reports and uploads ?

some notes on cov:
this may already be how the pipeline works , if so just ignore the following :)

regarding cov i think we could instrument with nyc only on the linux pipeline (including node and browser) and make the coverage job dependent on the linux pipeline.
linux pipeline would generate reports make then artifacts or whatever jenkins calls this :) and the cov job would only gather the artifacts and upload then to codecov. codecov will take care of the merging https://docs.codecov.io/docs/merging-reports

@alanshaw
Copy link

Hey @victorbjelkholm, thanks so much for pushing this forward. This addresses, in part, most of the issues I rasied in ipfs-inactive/dev-team-enablement#73. I understand that fully parallel isn't possible due to a bug in Jenkins. My only concern is that this'll actually take longer for a single run to complete, but I think the benefit of being able to restart just one section of the pipeline outweighs this right now.

P.S. I'm very happy the browser tests are now not being done for both node versions!

@victorb
Copy link
Contributor Author

victorb commented Sep 24, 2018

test:browser includes test:webworker or would be another job?

It does not. The pipeline automatically checks for the test:node, test:browser and test:webworker scripts, and only runs them if they exist. So projects that wants to run webworker tests needs to have a test:webworker script in their package.json.

coverage job also runs tests or just gathers the reports and uploads ?

Yeah, it runs through the tests while instrumenting and records which code gets run, then uploads the report.

My only concern is that this'll actually take longer for a single run to complete

Yup, that's a tradeoff that we would make in this case, against the ability of rerunning just certain steps. I think currently the bottle-neck is passing the stashed dependencies around the jobs. I tried adding a cache (basically zip the files and upload to jenkins master, and when running tests, download and unzip) but it added more time to the pipeline than just running npm install.

If we spin up a new npm-on-ipfs registry within the Jenkins network, we effectively have a npm cache that would greatly speed up the pipeline. ipfs-inactive/dev-team-enablement#51

@hugomrdias
Copy link

if the cov runs the tests wouldn't be better to setup something like i described in the previous comment?

regarding cov i think we could instrument with nyc only on the linux pipeline (including node and browser) and make the coverage job dependent on the linux pipeline.
linux pipeline would generate reports make them artifacts or whatever jenkins calls this :) and the cov job would only gather the artifacts and upload then to codecov. Codecov will take care of the merging docs.codecov.io/docs/merging-reports

@victorb
Copy link
Contributor Author

victorb commented Sep 24, 2018

@hugomrdias yeah, that makes sense. npm run coverage currently just runs on linux, we could easily run it on windows + macos as well. Bit harder to make it work with browsers I think, that work would have to go into aegir itself, and that would trickier than just cross-platform.

One thing that bothers me with this refactored pipeline is that even though dependencies for linux has been fully downloaded, it doesn't continue until all platforms have done the same. Gonna try to find a solution for this before merging this.

@hugomrdias
Copy link

my point is aegir coverage should only be gather/upload files generated by aegir test --coverage. We would only run tests once.

  • adding nyc to karma should be fairly easy
  • codecov handles merging of multiple reports which is perfect
  • for now running aegir test --coverage on linux should be enough, we can do a simple benchmark to see how much nyc delays the test run if its negligible just run all tests with coverage.

I can help on aegir side to make this happen!!

License: MIT
Signed-off-by: Victor Bjelkholm <git@victor.earth>
@victorb
Copy link
Contributor Author

victorb commented Sep 24, 2018

@hugomrdias yeah, sounds like a good idea, collect coverage when we do the runs, then after we can just upload all the reports. Will add this is a feature request after finishing this pipeline.

Regarding pipeline having to wait for all platforms before continuing, we have two options:

  • Either we make the tests to continue directly after installing the dependencies, but we can't run the webworker + browser + node tests in parallel then, they will have to run sequentially
  • Or, we have it the way we have, and all platforms can start their test at the same time, but the pipeline will wait for all npm install to finish. As mentioned above, if we put npm-on-ipfs within the Jenkins infra, npm install should become very fast so I think this is the way to go for now.

@victorb
Copy link
Contributor Author

victorb commented Sep 24, 2018

In the end, pipeline now looks like this:

image

Since "checks" moved into their own stage and browser+webworker tests now doesn't run per nodejs version but only per OS, it seems it doesn't have too much impact on the speed of each test.

@victorb
Copy link
Contributor Author

victorb commented Sep 24, 2018

Gonna run this pipeline once on every JS project to make sure it works everywhere, then I'll merge it and it'll be used by default.

If you have any objections, now is your chance!

License: MIT
Signed-off-by: Victor Bjelkholm <git@victor.earth>
@hugomrdias
Copy link

Can we still restart jobs inside the tests pipeline? Or we need to re run everything if windows fails? Just an example nothing against windows ^^

@victorb
Copy link
Contributor Author

victorb commented Sep 24, 2018 via email

@victorb victorb merged commit dc4af3a into master Sep 25, 2018
@victorb
Copy link
Contributor Author

victorb commented Sep 25, 2018

Thanks everyone for your feedback!

@victorb victorb deleted the js-pipeline-refactor branch September 25, 2018 08:37
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants