Create multi-stage dockerfile #734

Merged: mstenta merged 6 commits into farmOS:3.x from docker-image-improvements on Nov 6, 2023

Contributor

@AlMaVizca AlMaVizca commented Oct 11, 2023

  • Creating files for PHP configurations to improve legibility

I would also like to refactor the build-farmOS.sh script to:

  • Install composer packages in another stage to only copy the vendor folder
  • Remove git and zip packages from the image, using something similar to PHP_GEOS

Please let me know if it would be OK to add all of that to this PR, or if you would prefer to create another issue for it.

Edited according to this comment

@AlMaVizca AlMaVizca marked this pull request as ready for review October 11, 2023 11:40
Member

@mstenta mstenta left a comment

Thanks for tackling this @AlMaVizca! It's funny because just the other day I was thinking about refactoring this to use a multi-stage build.

Can we split this up into multiple atomic commits to make it easier to review and to keep each "intention" separate? I like to try to keep the changes in each commit very focused (even to the point of separating "whitespace" changes out to their own commits).

It seems like the primary changes are:

  • Moving PHP configurations to separate files to improve legibility
  • Reduce image size with multi-stage build process
  • Run container as www-data user

> I would also like to refactor the build-farmOS.sh script to:
>
>   • Install composer packages in another stage to only copy the vendor folder
>   • Remove git and zip packages from the image, using something similar to PHP_GEOS
>
> Please let me know if it would be OK to add all of that to this PR, or if you would prefer to create another issue for it.

I'm curious what this would look like, but maybe we should keep it separate for now.

I also need to think a little deeper about how all of these changes may affect images that extend this base image.

And lastly, right now the focus is on docker/Dockerfile, but we also have docker/dev/Dockerfile, which may also need to be considered to make sure nothing breaks.

Thanks again for diving in and proposing next steps!

@AlMaVizca
Contributor Author

AlMaVizca commented Oct 12, 2023

It's my pleasure to cooperate; it's also my way of understanding the system and seeing whether I can use it :)

> I'm curious what this would look like, but maybe we should keep it separate for now.
>
> I also need to think a little deeper about how all of these changes may affect images that extend this base image.
>
> And lastly, right now the focus is on docker/Dockerfile, but we also have docker/dev/Dockerfile, which may also need to be considered to make sure nothing breaks.

About these: I see them as interconnected. Thanks for pointing out the dev version; I hadn't noticed it earlier.
Ideally, I would have a multi-stage Dockerfile like:

FROM drupal:<version> AS dependencies   # one stage for GEOS, another for Composer
....
FROM drupal:<version> AS dev   # all the dependencies are integrated here and extra configuration is added
....
FROM drupal:<version>   # the final copy, mainly from dev, with all the sources and dependencies but without the dev-specific code

With this Dockerfile template, a build without parameters creates the prod image, while the following in docker-compose.development.yaml stops the build at the dev stage.

version: '3'
services:
  ....
  www:
    depends_on:
      - db
    build:
      context: ./
      target: dev

If you'd like to go this way, I can handle those changes. I'm not sure what the other image dependencies are, but I'm willing to collaborate to get them working.

Member

@mstenta mstenta left a comment

Thanks @AlMaVizca! I reviewed in more detail and left some questions and change requests. Most are small nitpick changes. Overall I think this is on the right track!

I'd like to get @paul121's and @symbioquine's thoughts on this before we merge it too.

@AlMaVizca AlMaVizca force-pushed the docker-image-improvements branch 2 times, most recently from 94a52b1 to 9d1bb34 Compare October 13, 2023 07:41
Member

@mstenta mstenta left a comment

Thanks so much @AlMaVizca! I just noted a few tiny nits. I will trigger the workflow run so we can see if this works, or if it breaks anything downstream of it, like the 2.x-dev image.

@AlMaVizca
Contributor Author

My pleasure.
I saw the workflow failing in two unrelated tasks:

  • No added records on the changelog
  • Failing to build dev image.

Do I need to change anything there?

@mstenta
Member

mstenta commented Oct 13, 2023

Thanks @AlMaVizca! I think this is looking good!

> I saw the workflow failing in two unrelated tasks:
>
>   • No added records on the changelog
>   • Failing to build dev image.
>
> Do I need to change anything there?

Don't worry about the changelog failure... we have that to remind us to add notes about new features, bug fixes, etc., but I generally don't include internal dev tooling changes like this.

We do need to fix the dev image build, however. That is a core part of our deliver.yml workflow, because it is used to run our tests, static code analysis, etc.

It is based on the farmos/farmos:2.x image, so we probably just need to make some adjustments based on the changes in this PR.

It looks like the first thing it failed on was the change from opcache-recommended.ini to opcache-recommended.overrides.ini:

#5 [ 2/12] RUN sed -i 's|opcache.revalidate_freq=60|opcache.revalidate_freq=0|g' /usr/local/etc/php/conf.d/opcache-recommended.ini
#5 0.266 sed: couldn't open temporary file /usr/local/etc/php/conf.d/sedHHw5A9: Permission denied
#5 ERROR: process "/bin/sh -c sed -i 's|opcache.revalidate_freq=60|opcache.revalidate_freq=0|g' /usr/local/etc/php/conf.d/opcache-recommended.ini" did not complete successfully: exit code: 4

https://github.com/farmOS/farmOS/actions/runs/6505581983/job/17679156363

There may be other failures after that too, but that's what caused the test to abort.

We do a funny thing with the 2.x-dev image regarding the www-data user, to make local development easier, which may need to be considered. We override the user ID of the www-data user inside the image so that it matches the user ID of the host machine user. On Linux this is typically uid 1000, but we supply an ARG for end-users/developers to build with a different uid if they need something different. The reason we do this is so that all of the files in /opt/drupal/ are owned by www-data inside the container, and the host user outside the container, so that they can load the entire codebase into their IDE and do development without having to mess with permissions.

# Change the user/group IDs of www-data inside the image to match the ID of the

You'll notice that in the 2.x-dev image build we actually change to the www-data user before running build-farmOS.sh (as well as creating a few debug config files), but then change back to root user at the end. Perhaps there's a better way to meet all these needs... 🤔 Curious if you have any thoughts! But if you want to just make it work for now so we can merge this as a first step that's fine too.
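
As a rough illustration of the UID remapping described above, the following sketch shows the kind of commands such a build step could run. It is not the project's actual Dockerfile content: the function names and the `DRY_RUN` switch are hypothetical, and only `usermod -u` and `groupmod -g` are standard tools. The default of 1000 is the typical Linux host uid mentioned above.

```shell
#!/bin/sh
# Sketch only: remap the www-data user and group to a host-matching numeric ID.
# DRY_RUN=1 prints the commands instead of executing them (executing them
# requires root, for example inside an image build).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

remap_www_data() {
  new_id="${1:-1000}"   # 1000 is the typical Linux host uid
  run usermod -u "$new_id" www-data &&
    run groupmod -g "$new_id" www-data
}
```

As far as I know, `usermod -u` only re-owns files in the user's home directory, so files elsewhere (such as /opt/drupal) may need a separate `chown` afterwards.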

@AlMaVizca
Contributor Author

Thanks, @mstenta.
I think it's best to have only one Dockerfile, with an intermediate stage for the dev version and production as the final image. From what you explained, and looking at the workflows, that will require a change there too. So, to keep this PR from growing too big, my proposal is to fix the dev image and make sure everything works as you expect in this first step.

Later I'll work on two other PRs:

  • Adapt the workflow and images to fit in one docker image
  • Standardize the project scaffolding, although there is no actual 'standard' for it. I've been working on an article about simplifying the spin-up of any project. The idea is simple: using a Makefile and Docker along with docker-compose, any project, regardless of language and framework, has named actions that are customized per project. Just to give an example:
make start   # Inside the Makefile, this defines how to run the development environment
make deploy  # Deploy according to project specifics
make test    # Set up and run the tests

So the workflow could be standard, let's say, call:

step 1: make test
step 2: make security-check
step 3: make deploy

I got a little sidetracked, so feel free to consider this one or not.

Member

@mstenta mstenta left a comment

Thanks again @AlMaVizca! I left two small comments on things I noticed in the diffs, and some more thoughts below... I know I'm asking for a lot of changes - and I appreciate all the time you're contributing to this! These are good improvements, and the Dockerfiles are critical pieces of our development and packaging infrastructure, so it is important to get right. :-)

I'm a proponent of the "don't break HEAD" rule when creating commits. I like each commit to be a self-contained incremental change, which could be merged by itself (in sequence) without breaking things (not always possible, but a good discipline to try to stick to IMO).

With that in mind, can we split up some of the changes in your last two commits so that they can be merged into relevant previous commits?

For example, the commit labelled "Moving PHP configurations to separate files to improve legibility" can include the config file changes being made to the dev image as well, I think. So essentially we have one commit that handles splitting out files for both images. And this way, that commit by itself could in theory be merged without breaking tests.

If we can approach each commit this way, that would be great - and also make it easier for me to review them as incremental improvements. I may even be able to merge some of them into 2.x independently, while others await further review/change requests by other maintainers.

Right now it's a bit hard for me to juggle the latest two commits in my head because they both make changes to the way USER is handled in the image. It seems like the first commit breaks things, and the second commit intends to fix them. If we can rearrange things a bit to make the steps more clear-cut that would be easier to wrap my head around.

Lastly, I think it makes sense to split out phpcs.xml and phpstan.neon, but I'm not sure we should deviate from the current approach of using sed for the phpunit.xml file specifically. Currently we are not maintaining that entire file in our codebase. We are simply copying from the Drupal core phpunit.xml.dist file, and overriding specific bits. If we move to maintaining the whole file we lose the benefit of inheriting changes that may be introduced in upstream Drupal core. So even though it is a bit ugly in our Dockerfile, I think it is better to leave that one for now.

@mstenta
Member

mstenta commented Oct 18, 2023

Thanks @AlMaVizca - looking good!

I pulled your branch down locally and made a few small changes myself. Mostly small adjustments and moving some things around (the commits themselves and some changes moved/split out). My goal was to make the commit messages "tell the story" of the changes and make each commit simple and focused. Not much different from what you had originally, so hopefully it makes sense! I left your commit author credit on all of them. Please take a moment to review and let me know if you see anything wrong or confusing.

I moved the three "small" changes to the front, and put the three bigger ones (config files + multi-stage + www-data user) last - although the last one now isn't very big at all, from a git diff perspective, which is nice.

The only notable deviation from your commits is that I tried to simplify the changes for the www-data user commit in the dev Dockerfile. So instead of moving things around it is just adding some additional USER commands where needed. I figure it might be worth following up to combine some of our RUN commands more generally, to reduce the number of layers that are created, but that can be separate, and for now this keeps these changes very simple and focused.

Let me know what you think! I'll start the tests running to see if anything fails. I tested building both images locally and they worked.

mstenta previously approved these changes Oct 18, 2023
@mstenta
Member

mstenta commented Oct 18, 2023

Docker image size differences:

farmos/farmos:2.x:

Before: 1,023,905,894 bytes
After: 1,005,086,774 bytes
Difference: -18,819,120 bytes

farmos/farmos:2.x-dev:

Before: 1,446,640,084 bytes
After: 1,427,806,248 bytes
Difference: -18,833,836 bytes

So we're saving about 18 MB in size with the multi-stage build process (which now only splits out the GEOS build). That's not quite as big an improvement as in @AlMaVizca's original PR, but still a bit slimmer. :-)
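
The differences above can be reproduced with plain shell arithmetic. This is a minimal sketch: `size_diff` is a hypothetical helper, and the byte counts are the ones reported in this comment. A size in bytes can be read with `docker image inspect --format '{{.Size}}' IMAGE`.

```shell
#!/bin/sh
# Sketch: difference in bytes between two image sizes (after minus before).
# A negative result means the image got smaller.
size_diff() {
  before="$1"
  after="$2"
  echo $((after - before))
}
```

For example, `size_diff 1023905894 1005086774` gives -18819120 for farmos/farmos:2.x, and `size_diff 1446640084 1427806248` gives -18833836 for farmos/farmos:2.x-dev. That is about 18.8 MB in decimal units, or roughly 17.9 MiB.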

@mstenta
Member

mstenta commented Oct 23, 2023

While we are thinking about this: maybe it would be worth splitting this PR into two new ones:

  1. Create multi-stage Dockerfile
  2. Run Docker container as www-data user

I think all of the commits leading up to the last one are ready for final review. It's just this last one that may become a bit more hairy.

@mstenta
Member

mstenta commented Oct 23, 2023

> Would it be ok to check for something more specific, like /opt/drupal/web/profiles/farm/farm.info.yml?
>
> Another question that popped up is: how clean are the GitHub environments?
> I'm asking because, before these changes, the file defined by FARMOS_FS_READY_SENTINEL_FILENAME was never removed, so it might still be there and owned by root, which would cause line 23 of the entrypoint to fail when it tries to overwrite the file as www-data.

Just wanted to respond to this as well...

None of this really matters, because the problem runs deeper. It happens if you try to run the normal (non-dev) container as well, even without FARMOS_FS_READY_SENTINEL_FILENAME:

$ docker run --rm -it -v "./www:/opt/drupal" farmos-docker-improvements
farmOS codebase not detected. Copying from pre-built files in the Docker image.
cp: cannot create directory '/opt/drupal/./.git': Permission denied
cp: cannot create regular file '/opt/drupal/./composer.json': Permission denied
cp: cannot create regular file '/opt/drupal/./composer.lock': Permission denied
cp: cannot create directory '/opt/drupal/./vendor': Permission denied
cp: cannot create regular file '/opt/drupal/./.editorconfig': Permission denied
cp: cannot create regular file '/opt/drupal/./.gitattributes': Permission denied
cp: cannot create directory '/opt/drupal/./web': Permission denied
cp: preserving times for '/opt/drupal/.': Operation not permitted

So this is a core incompatibility with our current docker-entrypoint.sh script logic and running the container as www-data.

@mstenta
Member

mstenta commented Oct 23, 2023

> Docker compose files are mounting ./www:/opt/drupal. If the path ./www doesn't exist, it will be created with root as the owner.

This is the core issue. If the ./www directory does not exist before docker compose up (or docker run) runs, then it will be owned by root, and docker-entrypoint.sh (running as www-data) will not be able to copy files into it.

We could potentially work around this by modifying our deliver.yml to create the www directory and chown www-data:www-data it before we run docker compose up.
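
A minimal sketch of that workaround follows, under stated assumptions. `prepare_bind_mount` is a hypothetical helper, and the numeric owner `33:33` in the usage note is an assumption: 33 is the usual www-data uid/gid on Debian-based images, but it should be verified for the image in use. A host without a www-data user would need the numeric form.

```shell
#!/bin/sh
# Sketch: create the bind-mount directory before the container starts, so that
# Docker does not create it (owned by root) on first run.
# The optional second argument is the owner to set, e.g. "33:33".
prepare_bind_mount() {
  dir="$1"
  owner="${2:-}"
  mkdir -p "$dir" || return 1
  if [ -n "$owner" ]; then
    # Changing ownership to another user normally requires root (e.g. sudo).
    chown "$owner" "$dir" || return 1
  fi
}
```

It could be used as `prepare_bind_mount ./www 33:33 && docker compose up -d`, run with sufficient privileges for the `chown`.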

But we would also have to update our docs to tell end-users who are setting up their development/production environments to do that manually as well. I think that would work, but it's a bummer because it adds manual steps to the setup (always prefer removing steps to adding them). And we need to consider whether or not this will affect existing installs.

All-in-all, this is feeling like a more complicated change than I originally anticipated. Which means it will need a lot more community review and discussion before we can merge it regardless.

All the more reason to split it out to its own PR, IMO.

@mstenta
Member

mstenta commented Oct 23, 2023

> To preserve this behavior, the only action that I can think of is to add an empty folder www to the repository

This would only address the CI/CD pipeline failures. But it would not help end-users. They would still need to manually create their bind-mount directory(ies) and set the ownership. See the instructions we provide here, for example:

https://farmos.org/hosting/install/#farmos-in-docker

And here: https://farmos.org/hosting/composer/ (which reminds me... that page would need to be updated with this PR too).

@AlMaVizca AlMaVizca changed the title from "Create multi-stage dockerfile and run the service as www-data user" to "Create multi-stage dockerfile" on Oct 24, 2023
@AlMaVizca
Contributor Author

As a summary, I'm removing the last two commits, and adding a small change to build bcmath inside the php-dependencies stage.

> And here: https://farmos.org/hosting/composer/ (which reminds me... that page would need to be updated with this PR too).

I'm not sure which part you are referring to. If it's the Docker section, I think it is still valid, since it's a reference for extending the image.

@mstenta
Member

mstenta commented Oct 24, 2023

> As a summary, I'm removing the last two commits, and adding a small change to build bcmath inside the php-dependencies stage.

Thanks @AlMaVizca! I'll take a look soon.

> I'm not sure which part you are referring to. If it's the Docker section, I think it is still valid, since it's a reference for extending the image.

Yes I meant we'll need to remember to test/update that in the www-data PR, because right now it assumes that it is running as root. I'll follow up in the new PR so we remember that.

@mstenta
Member

mstenta commented Oct 24, 2023

Tests are passing! I think we can flag this for final review now.

@AlMaVizca
Contributor Author

Just a question, in case it reduces the amount of effort on your side.
I've created the commits for the other PR, splitting into more stages, and removing the dev/Dockerfile.
Do you prefer them apart?
I've created the PR on my repo, so you can take a quick look at it.

Member

@paul121 paul121 left a comment

This LGTM! I'm not entirely sure, but I think there are now a few 3.x changes we need to incorporate into this PR too?

Really really like that configuration is getting split out into separate files. Appreciate all the work on this @AlMaVizca @mstenta !

Comment on lines 42 to 43
# Configure PHPUnit.
RUN cp -p /var/farmOS/web/core/phpunit.xml.dist /var/farmOS/phpunit.xml \
Member

Could we move the PHPUnit config to a separate file in files/phpunit.xml? I know we're copying this over from the Drupal core phpunit.xml.dist, but I wonder how often this changes. We haven't needed to modify our code in a few years... but I wonder how much of that is just luck in not having conflicting changes from Drupal core 🤔

Member

@AlMaVizca did separate this one out originally and I asked that we keep it the way it was, because separating it out means we are maintaining the whole file. I agree it may not change often, but if it does there's no way for us to know. Currently we get any changes automatically (or the sed command will fail, notifying us that we need to update it). I think that's worth keeping as-is.
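
One point may be worth double-checking here: as far as I know, `sed` exits with status 0 even when a substitution matches nothing, so a plain `sed -i 's|…|…|'` would not by itself fail the build if upstream changed the text. The sketch below shows one way to make a missing pattern an error. `replace_or_fail` is a hypothetical helper, not part of the farmOS Dockerfile, and it assumes GNU `sed -i` as used elsewhere in this thread.

```shell
#!/bin/sh
# Sketch: apply a substitution, but fail if the pattern is not present, so a
# changed upstream file breaks the build instead of passing silently.
# The pattern is treated as a basic regular expression by both grep and sed,
# and must not contain the "|" delimiter.
replace_or_fail() {
  pattern="$1"
  replacement="$2"
  file="$3"
  if ! grep -q -- "$pattern" "$file"; then
    echo "replace_or_fail: pattern not found in $file: $pattern" >&2
    return 1
  fi
  sed -i "s|$pattern|$replacement|g" "$file"
}
```

In a Dockerfile, a guard like this could sit in a small script that the `RUN` step calls, so that a non-zero exit status aborts the build.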

Contributor Author

@paul121 @mstenta you will see in a follow-up PR that I've created a script with all the regexps, so the Dockerfile is cleaner.

@mstenta
Member

mstenta commented Oct 31, 2023

> think there are now a few 3.x changes we need to incorporate into this PR too?

This is true. We'll need to rebase this onto 3.x. I can take a pass at that this morning since I'm familiar with the relevant 2.x->3.x changes...

@mstenta
Member

mstenta commented Oct 31, 2023

> I've created the commits for the other PR, splitting into more stages, and removing the dev/Dockerfile.
> Do you prefer them apart?

Thanks @AlMaVizca! Looking forward to reviewing those too! (Although it might be a little bit before I have a chance.)

Yes let's keep them separate for now, since this PR is already close to being merged.

@mstenta
Member

mstenta commented Oct 31, 2023

> We'll need to rebase this onto 3.x. I can take a pass at that this morning

Done!

@mstenta mstenta changed the base branch from 2.x to 3.x October 31, 2023 14:04
@mstenta
Member

mstenta commented Oct 31, 2023

Not sure why tests aren't running on this anymore (maybe due to base branch change and/or the fact that the PR branch doesn't start with 3.x-*), but I pushed a copy of it to my fork and tests passed FYI: https://github.com/mstenta/farmOS/actions/runs/6707691111

@mstenta mstenta merged commit 0df3970 into farmOS:3.x Nov 6, 2023
8 of 9 checks passed
@mstenta
Member

mstenta commented Nov 6, 2023

Merged! Thanks again for all your work (and patience!) with this one @AlMaVizca! Great improvements!

@AlMaVizca AlMaVizca deleted the docker-image-improvements branch November 19, 2023 14:36