Opened 8 months ago

Last modified 7 weeks ago

#16857 new enhancement

Set up gitlab

Reported by: michael2402 Owned by: team
Priority: normal Milestone:
Component: unspecified Version:
Keywords: hack-weekend-2018-10 Cc:

Description (last modified by wiktorn)

Currently, JOSM uses SVN for core and most plugins. A lot of external source code is included in the SVN repository.
We also have the problem that Jenkins is running on the main server and does not scale well.

When fixing this, we should migrate to git (especially because of the easier branching support, the better tooling, and because we need to extract parts of the project anyway while preserving history).

The projects will then be

  • Project: JOSM with repository JOSM
  • Project: JOSM-Plugins with one repository for each plugin
  • Project: Dependencies with repository jmapviewer, gettext, ...

The long-term goal is to migrate everything to git and not depend on the OSM SVN server any more.

Attachments (2)

hack-weekend-2018-10-git.png (84.3 KB) - added by simon04 8 months ago.
hack-weekend-2018-10-git.2.png (207.7 KB) - added by simon04 8 months ago.

Change History (46)

comment:1 Changed 8 months ago by stoecker

We also have the problem that Jenkins is running on the main server and does not scale well.

Please explain that. I don't remember anything related to that comment. We have certain resources assigned and use them. The issues we had usually came from Jenkins running amok and using more than it should have.

We had separate servers for this in the past (I think Vincent provided it?) and I wanted to have it on the main server, so we have better control over an increasingly more essential component of the development workflow. And I'm not so sure if asking our sponsor to provide two servers isn't maybe a bit much.

Install local gitlab server:

Explain that as well. What should the scope of GitLab be? GitLab is much more than git.

comment:2 in reply to:  1 ; Changed 8 months ago by michael2402

Replying to stoecker:

I don't remember anything related to that comment.

I remember Trac getting unresponsive as soon as a build is running... ;-)

We had separate servers for this in the past (I think Vincent provided it?) and I wanted to have it on the main server, so we have better control over an increasingly more essential component of the development workflow. And I'm not so sure if asking our sponsor to provide two servers isn't maybe a bit much.

This is exactly the reason why we want to move to a container approach there. So that the build runners are only runners and can easily be replaced. This would allow us to keep the configuration in a central place and scale them as we need them (run locally, run on a different server if someone sponsors one, ...).

Install local gitlab server:

Explain that as well. What should the scope of GitLab be? GitLab is much more than git.

GitLab lets us view the commits, add clickable links to Trac issues in commit messages, manage permissions, ...

We discussed several alternatives and so far a self-hosted gitlab community edition seems to be the best one (we could get gold for free but decided against it because we don't know if it will stay available).
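
To illustrate the "clickable link" idea, here is a minimal sketch that rewrites ticket references such as #16857 in a commit message into ticket URLs. GitLab would do this itself once an external issue tracker is configured; the function name `linkify_tickets` and the `<base>/ticket/<number>` URL layout are assumptions made for illustration only.

```shell
# Turn "#12345" ticket references in a commit message (read from stdin)
# into ticket URLs.
# $1 = base URL of the ticket tracker (placeholder supplied by the caller;
#      it must not contain the characters "|" or "&").
linkify_tickets() {
    sed "s|#\([0-9][0-9]*\)|$1/ticket/\1|g"
}

# Example:
#   git log -1 --format=%B | linkify_tickets https://trac.example.org
```

The sketch only shows the mapping from a ticket number to a URL; it does not produce HTML links.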

comment:3 in reply to:  2 ; Changed 8 months ago by stoecker

Replying to michael2402:

Replying to stoecker:

I don't remember anything related to that comment.

I remember Trac getting unresponsive as soon as a build is running... ;-)

Yes. Jenkins sometimes feels free to flood the hard disk with bullshit and assumes that taking 100% of the memory and most of the swap is a good idea. Very bad programming in my eyes (even after I reported a bug that fills the hard disk with gigabytes of data in a few seconds, they have not fixed it for months). There are safeguards now to keep it from running amok (at least I hope so).

We had separate servers for this in the past (I think Vincent provided it?) and I wanted to have it on the main server, so we have better control over an increasingly more essential component of the development workflow. And I'm not so sure if asking our sponsor to provide two servers isn't maybe a bit much.

This is exactly the reason why we want to move to a container approach there. So that the build runners are only runners and can easily be replaced. This would allow us to keep the configuration in a central place and scale them as we need them (run locally, run on a different server if someone sponsors one, ...).

That sounds like a good idea. If the runners are easy to deploy and don't affect other processes (i.e. guaranteed restricted memory, space and CPU requirements) I have lots of free server power available (on openSUSE systems usually :-).

Install local gitlab server:

Explain that as well. What should the scope of GitLab be? GitLab is much more than git.

GitLab lets us view the commits, add clickable links to Trac issues in commit messages, manage permissions, ...

We discussed several alternatives and so far a self-hosted gitlab community edition seems to be the best one (we could get gold for free but decided against it because we don't know if it will stay available).

In case we switch to git - there will be no hard switch (too many things depend on the current system).

So in your discussions, did you work out a plan for handling the period in which SVN and git run in parallel? E.g. these first 3 steps need a concrete procedure:

  • 1st step: have git as a mirror of SVN
  • 2nd step: also sync the other way, so that git actually becomes useful
  • 3rd step: find replacements for the SVN dependencies of many infrastructure components (down to the simple fact of a version number)
  • ...

In particular, are you prepared for the 2nd step turning out to be the last one, because changing the rest no longer looks like a good idea once started?

comment:4 in reply to:  3 ; Changed 8 months ago by michael2402

Replying to stoecker:

Replying to michael2402:

This is exactly the reason why we want to move to a container approach there. So that the build runners are only runners and can easily be replaced. This would allow us to keep the configuration in a central place and scale them as we need them (run locally, run on a different server if someone sponsors one, ...).

That sounds like a good idea. If the runners are easy to deploy and don't affect other processes (i.e. guaranteed restricted memory, space and CPU requirements) I have lots of free server power available (on openSUSE systems usually :-).

They are available as a Docker container or as a deb package, with almost no config required (just enter the master GitLab URL and generate a token on the GitLab server ;-))
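
As a rough sketch of how little configuration that is, the function below assembles and prints a non-interactive `gitlab-runner register` command line instead of running it. The URL, the token, the `openjdk:8` image and the limit values are placeholders. The `--docker-memory` and `--docker-cpus` flags are my assumption of how the memory and CPU restrictions asked for above could be expressed, and should be checked against `gitlab-runner register --help` for the installed version.

```shell
# Print (do not run) a registration command for a Docker-based runner.
# $1 = GitLab URL, $2 = registration token (both supplied by the caller).
# Image name and resource limits are placeholder values.
runner_register_cmd() {
    printf 'gitlab-runner register --non-interactive \\\n'
    printf '    --url %s \\\n' "$1"
    printf '    --registration-token %s \\\n' "$2"
    printf '    --executor docker \\\n'
    printf '    --docker-image openjdk:8 \\\n'
    printf '    --docker-memory 2g \\\n'
    printf '    --docker-cpus 1\n'
}

# Example:
#   runner_register_cmd https://gitlab.example.org/ "$TOKEN"
```

Printing the command first makes it easy to review the limits before a runner is actually registered on a shared machine.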

In case we switch to git - there will be no hard switch (too many things depend on the current system).

So in your discussions, did you work out a plan for handling the period in which SVN and git run in parallel? E.g. these first 3 steps need a concrete procedure:

  • 1st step: have git as a mirror of SVN
  • 2nd step: also sync the other way, so that git actually becomes useful
  • 3rd step: find replacements for the SVN dependencies of many infrastructure components (down to the simple fact of a version number)
  • ...

In particular, are you prepared for the 2nd step turning out to be the last one, because changing the rest no longer looks like a good idea once started?

We are planning to do step 3 first. As soon as we do not depend on any SVN features (svn externals and versioning), we can do the switch to git. There are some other things we want to keep in mind (e.g. git has an option to not download large files on every clone, we want to use this for the jar files that are currently in SVN. Otherwise, a git clone will always download the hundreds of megabytes of outdated libraries that most people won't use).

For versioning, we are planning to mark the tested versions with their real version (18.10, 18.11) from a user perspective and for our update system and keep the commit id only for internal error reporting.

We have not yet decided all details of the timeline; this mostly depends on the design of the build system (although we have settled on Gradle there, because it supports hosting all resources on our local Nexus instead of putting them in the repository, and it allows us to easily include our own build tasks written in Java).

comment:5 in reply to:  4 Changed 8 months ago by simon04

Replying to stoecker:

That sounds like a good idea. If the runners are easy to deploy and don't affect other processes (i.e. guaranteed restricted memory, space and CPU requirements) I have lots of free server power available (on openSUSE systems usually :-).

GitLab CI runner does not seem to provide openSUSE packages. However, the manual Linux installation only involves downloading a precompiled binary, setting up a user and installing the system service. Alternatively, as michael2402 mentioned, GitLab Runner can be used from a Docker container.

Replying to michael2402:

There are some other things we want to keep in mind (e.g. git has an option to not download large files on every clone, we want to use this for the jar files that are currently in SVN. Otherwise, a git clone will always download the hundreds of megabytes of outdated libraries that most people won't use).

Git Large File Storage (LFS)
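
For reference, a small sketch of what tracking JAR files with Git LFS amounts to. `git lfs track "*.jar"` records a pattern in `.gitattributes`; the helper below only prints that entry for a given pattern, so it can be inspected without touching a repository. The helper name `lfs_attributes_line` is made up for illustration.

```shell
# Print the .gitattributes entry that Git LFS uses for a tracked pattern.
# $1 = file pattern, e.g. '*.jar'
lfs_attributes_line() {
    printf '%s filter=lfs diff=lfs merge=lfs -text\n' "$1"
}

# Typical use inside a repository (requires the git-lfs extension):
#   git lfs install
#   git lfs track "*.jar"
#   GIT_LFS_SKIP_SMUDGE=1 git clone <url>   # clone without downloading LFS content
```

With the smudge step skipped, a clone contains only small pointer files for the JARs, which is the "do not download large files on every clone" behaviour mentioned above.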

comment:6 in reply to:  4 ; Changed 8 months ago by stoecker

For versioning, we are planning to mark the tested versions with their real version (18.10, 18.11) from a user perspective and for our update system and keep the commit id only for internal error reporting.

That's not a good idea. There is a major reason why the JOSM release versioning is the way it is: it is the simplest solution and as automatic as possible. Why? Because I have been a packager for openSUSE for a long time now, and I simply deal with too many projects which aren't able to get a release out. That's why the JOSM release is automatic. I remember when Vincent came to the project, fresh and full of good ideas, and wanted to add all these fancy things like announcements on social media and this and that and so many things. I said: Yes, you can do that, but that's all optional. The required minimum will not be increased. Some things from these plans survived. A lot have been dropped in between due to the additional work involved.

It should stay this way. ATM JOSM is actively developed, but that can change. I plan for such a situation even when I hope it will not come. The requirements to maintain and resetup the main components must be acceptable. I.E. ATM I can drop everything except SVN and the WIKI (essentially the Apache instance) and JOSM still is releasable.

My proposal would be to replace the SVN revision with a string like "YYYYMMDDX", e.g. 201812030. This can be done automatically, stays compatible with the SVN revision (it's higher!), allows up to 10 builds (e.g. hot-fixes) per release day and is independent of the build system. In most cases it will also be possible to guess the milestone from the revision (if we keep it inside the same month).

NOTE: An independent release means that the repository must be tagged, or we lose the exact assignment.
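
To make the "YYYYMMDDX" scheme concrete, here is a minimal sketch that builds such a string from a date and a per-day build index. The function name `make_version` and its argument order are hypothetical; how the build index would be determined (e.g. from existing tags) is left open.

```shell
# Build a version string of the form YYYYMMDDX.
# $1 = date as YYYYMMDD, $2 = build index of that day (single digit 0-9).
make_version() {
    day=$1
    idx=$2
    case $day in
        [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]) ;;
        *) echo "date must be YYYYMMDD" >&2; return 1 ;;
    esac
    case $idx in
        [0-9]) ;;
        *) echo "build index must be a single digit 0-9" >&2; return 1 ;;
    esac
    printf '%s%s\n' "$day" "$idx"
}

# Example: first build of today
#   make_version "$(date +%Y%m%d)" 0
```

Because the result is a plain nine-digit number, it compares numerically against the much smaller SVN revision numbers, which is what keeps the two schemes compatible.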

comment:7 in reply to:  6 ; Changed 8 months ago by michael2402

Replying to stoecker:

For versioning, we are planning to mark the tested versions with their real version (18.10, 18.11) from a user perspective and for our update system and keep the commit id only for internal error reporting.

That's not a good idea. There is a major reason why the JOSM release versioning is the way it is: it is the simplest solution and as automatic as possible. Why? Because I have been a packager for openSUSE for a long time now, and I simply deal with too many projects which aren't able to get a release out. That's why the JOSM release is automatic. I remember when Vincent came to the project, fresh and full of good ideas, and wanted to add all these fancy things like announcements on social media and this and that and so many things. I said: Yes, you can do that, but that's all optional. The required minimum will not be increased. Some things from these plans survived. A lot have been dropped in between due to the additional work involved.

It should stay this way. ATM JOSM is actively developed, but that can change. I plan for such a situation even when I hope it will not come. The requirements to maintain and resetup the main components must be acceptable. I.E. ATM I can drop everything except SVN and the WIKI (essentially the Apache instance) and JOSM still is releasable.

My proposal would be to replace the SVN revision with a string like "YYYYMMDDX", e.g. 201812030. This can be done automatically, stays compatible with the SVN revision (it's higher!), allows up to 10 builds (e.g. hot-fixes) per release day and is independent of the build system. In most cases it will also be possible to guess the milestone from the revision (if we keep it inside the same month).

NOTE: An independent release means that the repository must be tagged, or we lose the exact assignment.

The problem with generating this string is that:

  • In git, commits are not sorted by their time
  • The build time cannot be reproduced
  • Counting the number of commits would be possible but different and would still not account for parallel development.

Our goal is to have reproducible builds. So even if the josm server fails, as long as you have the backup of the git repository, you can re-compile all tested versions. We don't care about snapshot versions that much (we don't need an exact version match there).

See #16870 for discussions about versioning.

comment:8 in reply to:  7 Changed 8 months ago by stoecker

NOTE: An independent release means that the repository must be tagged, or we lose the exact assignment.

Our goal is to have reproducible builds. So even if the josm server fails, as long as you have the backup of the git repository, you can re-compile all tested versions. We don't care about snapshot versions that much (we don't need an exact version match there).

That's why the note above. If there is an independent version, then the build time MUST be tagged, otherwise it is not reproducible whatever system you think of. That means you add at least 30 tags each month automatically.

comment:9 in reply to:  1 ; Changed 8 months ago by Don-vip

Replying to stoecker:

I don't remember anything related to that comment. We have certain resources assigned and use them. The issues we had usually came from Jenkins running amok and using more than it should have.

The website and SVN server are unavailable every day for ~20 minutes when Jenkins runs the SpotBugs analysis for plugins. We experienced it during the hack weekend. I spent a lot of time tuning the configuration to minimize the problem but can't find a way to get rid of it. That's why I suggested moving the CI to a new server, to make sure the website is always available for Trac and JOSM downloads.

We had separate servers for this in the past (I think Vincent provided it?) and I wanted to have it on the main server, so we have better control over an increasingly more essential component of the development workflow. And I'm not so sure if asking our sponsor to provide two servers isn't maybe a bit much.

Yes, I was paying for a small VM at OVH. I stopped it because I didn't want to pay anymore :) We could also get an additional sponsor: I recently asked Red Hat whether they would accept us into their Resource Grants program. I'm still waiting for an answer. On paper it looks great, we just have to make sure it can be permanent ("How long can I stay enrolled in the program? The account credits are initially applied for 6 months, but extensions, discounts, and custom options can be negotiated based on your participation and need.").

Replying to stoecker:

That sounds like a good idea. If the runners are easy to deploy and don't affect other processes (i.e. guaranteed restricted memory, space and CPU requirements) I have lots of free server power available (on openSUSE systems usually :-).

Good to know!

comment:10 Changed 8 months ago by Don-vip

I'm working on #16231 right now, before looking at setting up GitLab. So we will have recent versions of packages before adding new stuff.

comment:11 Changed 8 months ago by stoecker

Before starting on the real server we should do a shutdown and make a snapshot in case something goes wrong :-)

Wrong ticket

comment:12 in reply to:  9 Changed 8 months ago by wiktorn

Replying to stoecker:

That sounds like a good idea. If the runners are easy to deploy and don't affect other processes (i.e. guaranteed restricted memory, space and CPU requirements) I have lots of free server power available (on openSUSE systems usually :-).

Good to know!

I'll also submit my offer here. I can offer 3 cores and up to 8 GB of RAM for our CI. It's my home server, so it may have some downtime, but I run some services for the Polish community, so I try to minimize it.

comment:13 Changed 8 months ago by mkoniecz

git has an option to not download large files on every clone, we want to use this for the jar files that are currently in SVN. Otherwise, a git clone will always download the hundreds of megabytes of outdated libraries that most people won't use

I am not sure whether it is a good idea to propose a history rewrite - but is it really necessary to keep libraries in one repo with "real" JOSM?

comment:14 in reply to:  13 Changed 8 months ago by michael2402

Replying to mkoniecz:

git has an option to not download large files on every clone, we want to use this for the jar files that are currently in SVN. Otherwise, a git clone will always download the hundreds of megabytes of outdated libraries that most people won't use

I am not sure whether it is a good idea to propose a history rewrite - but is it really necessary to keep libraries in one repo with "real" JOSM?

We are not doing a history rewrite. We migrate from SVN to GIT.

comment:15 Changed 8 months ago by mkoniecz

Yes, but note that in Git, cloning the full repository history is the standard way of getting a repository. It may be necessary to rewrite history if it has "hundreds of megabytes of outdated libraries".

comment:16 Changed 8 months ago by Don-vip

Can't we use git lfs also for large deleted files?

comment:17 Changed 8 months ago by stoecker

So maybe SVN is the better tool?

If we migrate, we won't strip the history. If that were necessary, git would simply be the wrong tool.

JOSM svn on disk has 1.1GB ATM.

In the git case we should copy the plugins and the other parts of JOSM from the OSM SVN, except the dist dir, to a second git repository. I assume that's probably about the same size.

comment:18 in reply to:  16 Changed 8 months ago by michael2402

Replying to Don-vip:

Can't we use git lfs also for large deleted files?

I think we should only use it for deleted files (or to-be-deleted files).

This is the top 100 list of big files, totaling more than 500 MB:

295f7ebeb75d   15MiB data_nodist/projection/OSTN02_NTv2.gsb    <- Keep this one
caa1e425eff8   11MiB test/lib/wiremock-standalone-2.18.0.jar
042a051f8f72   11MiB test/lib/wiremock-standalone-2.15.0.jar
4548122506cc   11MiB test/lib/wiremock-standalone-2.13.0.jar
422aa196c87f   11MiB test/lib/wiremock-standalone-2.10.1.jar
8b56b66952e6   11MiB test/lib/wiremock-standalone-2.7.1.jar
86972aaff051   10MiB tools/checkstyle/checkstyle-all.jar
a69bf35c891e   10MiB tools/checkstyle/checkstyle-all.jar
4b6d68435c56   10MiB tools/error_prone_ant.jar
a95f81efb1a4   10MiB tools/checkstyle/checkstyle-all.jar
20744cc1af7b   10MiB tools/checkstyle/checkstyle-all.jar
499a60abb45f   10MiB tools/checkstyle/checkstyle-all.jar
3484f646eb1e   10MiB tools/checkstyle/checkstyle-all.jar
9238129f389c  9,6MiB tools/error_prone_ant-2.0.17.jar
7b24bdcdef98  8,7MiB tools/error_prone_ant-2.0.19.jar
68cdc7591ac3  8,7MiB tools/error_prone_ant-2.0.18.jar
77c7a0d9c748  8,5MiB tools/error_prone_ant.jar
80cf45d0376a  8,0MiB tools/error_prone_ant.jar
f663deccbd9b  7,8MiB tools/error_prone_ant.jar
b4a04dcc3c3e  7,7MiB tools/error_prone_ant.jar
efb36fae6cf1  7,7MiB tools/error_prone_ant.jar
40236844722f  7,6MiB tools/error_prone_ant.jar
272934e0148d  7,6MiB test/data/wms/webatlas.no.xml                              <- This one will/should be kept
71ddb0498599  7,6MiB data_nodist/projection/projection-reference-data           <- Not so much of a problem
d2ce78281745  7,6MiB data_nodist/projection/projection-reference-data              Diffs are actually pretty small
d9f5026852f7  7,6MiB data_nodist/projection/projection-reference-data
54a0cb7d5e90  7,5MiB tools/error_prone_core.jar
e09fc8f8d021  7,5MiB tools/error_prone_core.jar
abe1fe312669  7,5MiB tools/error_prone_core.jar
cdb9dd584f8f  7,3MiB tools/error_prone_ant-2.0.15.jar
11019c0d841d  7,2MiB tools/error_prone_ant-2.0.14.jar
b7400b8b7cf4  7,0MiB tools/error_prone_ant-2.0.13.jar
c807ae8b6f78  7,0MiB tools/error_prone_ant-2.0.12.jar
9491c1ef2f8f  7,0MiB tools/groovy-all-2.3.8.jar
21d3c236f34a  7,0MiB tools/error_prone_ant-2.0.11.jar
64cdfc9e1bc8  7,0MiB tools/error_prone_core.jar
b6b186ade132  7,0MiB tools/groovy-all-2.3.9.jar
495be3eb25f5  6,9MiB tools/groovy-all-2.3.6.jar
11497a36356a  6,9MiB tools/groovy-all-2.3.7.jar
a321522d84b5  6,9MiB tools/groovy-all-2.3.4.jar
6ee565bb9c41  6,9MiB tools/groovy-all-2.3.2.jar
9568c034db95  6,9MiB tools/error_prone_ant-2.0.9.jar
3c4f8f357d17  6,8MiB tools/groovy-all.jar
7039fb98ca6d  6,8MiB tools/groovy-all.jar
b5eff4fcb954  6,7MiB tools/groovy-all.jar
b255074d7661  6,7MiB tools/groovy-all.jar
ff2e4245f39b  6,7MiB tools/groovy-all-2.4.11.jar
430138428f9c  6,7MiB tools/groovy-all-2.4.6.jar
77369064c118  6,7MiB tools/groovy-all-2.4.8.jar
1d92accc40bd  6,7MiB tools/groovy-all-2.4.7.jar
afeeac29d0f0  6,7MiB tools/groovy-all-2.4.6-SNAPSHOT.jar
0f5412639b7a  6,7MiB tools/groovy-all-2.4.4.jar
07835093c08a  6,7MiB tools/groovy-all-2.4.5.jar
0c3900b08868  6,7MiB tools/error_prone_core.jar
9b264b6f676f  6,6MiB tools/groovy-all-2.4.3.jar
7afd20af9b4f  6,6MiB tools/groovy-all-2.4.0.jar
168633db1328  6,6MiB tools/error_prone_javac.jar
b22e4bf707c4  6,3MiB tools/groovy-all-2.2.2.jar
512c0b4f4075  5,3MiB tools/checkstyle/checkstyle-all.jar
0fa7aa17c96f  5,3MiB tools/checkstyle/checkstyle-6.16-all.jar
6d00e3db35ff  5,3MiB tools/checkstyle/checkstyle-6.16.1-all.jar
ad0ddf54d36d  5,3MiB tools/checkstyle/checkstyle-6.15-all.jar
b1294163c1d5  5,3MiB tools/checkstyle/checkstyle-6.14.1-all.jar
df564e899fde  5,3MiB tools/checkstyle/checkstyle-6.14-all.jar
86440a23d231  5,2MiB tools/checkstyle/checkstyle-6.13-all.jar
f18e11b17c4b  5,2MiB tools/checkstyle/checkstyle-all.jar
899caf375af9  5,2MiB tools/checkstyle/checkstyle-6.12.1-all.jar
db03f8240bc9  5,2MiB tools/checkstyle/checkstyle-6.11.2-all.jar
ed70b7c6e4af  5,2MiB tools/checkstyle/checkstyle-6.11-all.jar
06600b948f5a  5,2MiB tools/checkstyle/checkstyle-6.10.1-all.jar
73762dc681fb  5,1MiB tools/groovy/groovy-2.5.1.jar
75fc3f3e5282  5,1MiB tools/groovy/groovy-2.5.1.jar
9fbd925db622  5,1MiB tools/checkstyle/checkstyle-6.9-all.jar
4ec27f06f4e6  5,1MiB tools/groovy/groovy-2.5.0.jar
d1d6ab027978  5,0MiB tools/checkstyle/checkstyle-all.jar
7c08c41cae17  5,0MiB tools/checkstyle/checkstyle-all.jar
3f8bd7cd566b  5,0MiB tools/checkstyle/checkstyle-all.jar
63ffbb554fc4  5,0MiB tools/checkstyle/checkstyle-all.jar
80ea12cbdb03  5,0MiB tools/checkstyle/checkstyle-all.jar
edad856694d2  4,9MiB tools/checkstyle/checkstyle-7.3-all.jar
fc64ac4dbced  4,9MiB tools/checkstyle/checkstyle-7.2-all.jar
8fa646d979fd  4,9MiB tools/checkstyle/checkstyle-7.1.2-all.jar
66ff23cf7d9d  4,9MiB tools/checkstyle/checkstyle-7.1.1-all.jar
ea05bd2ff1d7  4,9MiB tools/checkstyle/checkstyle-7.1-all.jar
df4187784a83  4,9MiB tools/checkstyle/checkstyle-7.0-all.jar
690bd2d9b392  4,9MiB tools/checkstyle/checkstyle-6.19-all.jar
9d3549290f66  4,9MiB tools/checkstyle/checkstyle-6.17-all.jar
faeedde343c0  4,9MiB tools/checkstyle/checkstyle-6.18-all.jar
de236e50a220  4,8MiB tools/pmd/saxon-9.1.0.8.jar
7e65c018cadd  4,6MiB tools/checkstyle/checkstyle-6.8.1-all.jar
ce3528d29266  4,6MiB tools/checkstyle/checkstyle-6.7-all.jar
1ee2bd3eacbf  4,5MiB data_nodist/projection/projection-reference-data
b5cc9e4dd60b  4,5MiB data_nodist/projection/projection-reference-data
89ef1b46448b  4,5MiB data_nodist/projection/projection-reference-data
8f2a03d1b96f  4,5MiB data_nodist/projection/projection-reference-data
ef8a11ab3e80  4,5MiB data_nodist/projection/projection-reference-data
5be0b75def71  4,5MiB data_nodist/projection/projection-reference-data
7475253cbe51  4,5MiB data_nodist/projection/projection-reference-data
c5f1e04b44a4  4,5MiB data_nodist/projection/projection-reference-data
0b3cc9980eb6  4,5MiB data_nodist/projection/projection-reference-data

So most of the big files are JAR files we do not really need any more, and we can live with the fact that an old version of JOSM cannot be built completely offline (as is already the case with SVN at the moment anyway).

The sum of the size of all JAR files is currently 958.065.271 Bytes in the git mirror. A few kilobytes are the plugin test files, but the remaining files can safely be externalized using git LFS in my opinion.

This is what I used to get the list:
git rev-list --objects --all | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' | sed -n 's/^blob //p' | sort --numeric-sort --key=2 --reverse | cut -c 1-12,41- | grep "\\.jar"
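
A total like the one quoted above can be computed from the same `git cat-file --batch-check` output. The following is a minimal sketch, not necessarily how the figure was obtained: it reads lines of the form "<type> <sha> <size> <path>" on stdin and sums the sizes of all blobs whose path ends in .jar. The function name `sum_jar_bytes` is hypothetical.

```shell
# Sum the sizes (in bytes) of all .jar blobs.
# Input on stdin: lines of "<type> <sha> <size> <path>", as printed by
#   git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)'
sum_jar_bytes() {
    awk '$1 == "blob" && $NF ~ /\.jar$/ { total += $3 }
         END { printf "%.0f\n", total }'
}

# Usage inside a clone:
#   git rev-list --objects --all \
#     | git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
#     | sum_jar_bytes
```

Every historical version of a JAR is a separate blob, so the sum reflects what a full clone has to transfer before compression, not the size of the current working tree.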

comment:19 in reply to:  17 Changed 8 months ago by michael2402

Replying to stoecker:

In the git case we should copy the plugins and the other parts of JOSM from the OSM SVN, except the dist dir, to a second git repository. I assume that's probably about the same size.

We wanted to do a separate git for each plugin.

If you really need all the plugins (for core dev / ...) they can easily be checked out using a small script. But most contributors will only be working on one or two plugins.

Plugins then share a common build script in the form of a dependency and do not rely on a specific directory layout. You can even build / develop them without having to check out JOSM (since JOSM is then just another dependency of the plugin and will be fetched from the Nexus automatically by the IDE / build script).

comment:20 Changed 8 months ago by stoecker

We wanted to do a separate git for each plugin.

Why? All these plugins are more or less abandoned by their original authors and are managed by the JOSM team. Splitting them makes that management much more complex. I18N updates and rebuilding them sound like a nightmare to me when I have to care for more than 100 repositories instead of more than a hundred directories. And it is a waste of space on my hard disk when all the dependencies are on my disk a hundred times, as each repository has them duplicated.

Hmm, the more you explain the "git project", the less I understand the advantages for JOSM. Could someone please present the advantages for JOSM - no general advantages, but those which affect JOSM and would improve the situation we have. Where are the shortcomings you want to fix? And also state the disadvantages.

And don't come with guesses like "it would attract more developers". Present facts only.

All the descriptions above actually show that you want to introduce a lot of new problems, some of them major ones (e.g. git lfs is an ugly workaround for a general design fault of git.)

For me "git" ATM is still a "we want it because we want it" idea. We may go this way because of this reason, but actually I'd like to have a little more.

comment:21 in reply to:  20 Changed 8 months ago by Don-vip

Replying to stoecker:

Why? All these plugins are more or less abandoned by their original authors and are managed by the JOSM team. Splitting them makes that management much more complex. I18N updates and rebuilding them sound like a nightmare to me when I have to care for more than 100 repositories instead of more than a hundred directories. And it is a waste of space on my hard disk when all the dependencies are on my disk a hundred times, as each repository has them duplicated.

This is already a nightmare for me with all these GitHub plugins. I never found the time to write scripts to make life easier, but this is clearly needed as of today. Once we have those scripts to make life easier again, there is no problem to have a git repo per plugin.

One big advantage of having separate repos is that we can control them better. On the current OSM SVN, several beginners have already completely corrupted the repo because they wanted to update/add a single plugin and didn't understand what they were doing with SVN. This can also be very problematic with GSoC students who are less talented than others. For example, next year I'd like to give Jo's next student commit access to the pt_assistant repo only. This way I won't have to fix the potential mistakes :)

comment:22 in reply to:  20 ; Changed 8 months ago by michael2402

Replying to stoecker:

Why? All these plugins are more or less abandoned by their original authors and are managed by the JOSM team. Splitting them makes that management much more complex. I18N updates and rebuilding them sound like a nightmare to me when I have to care for more than 100 repositories instead of more than a hundred directories.

Looping through them is no problem, we talked about that.
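
As an illustration of such a loop, here is a minimal sketch of a helper that runs one command in every git checkout directly below a parent directory. The name `for_each_repo` and the `plugins` directory in the usage comment are assumptions, not an existing JOSM script.

```shell
# Run a command in every direct subdirectory of $1 that is a git checkout.
# Usage: for_each_repo <parent-dir> <command> [args...]
for_each_repo() {
    parent=$1
    shift
    for dir in "$parent"/*/; do
        # Skip anything that is not a git working copy.
        [ -e "$dir/.git" ] || continue
        # Run in a subshell so the caller's working directory is untouched.
        ( cd "$dir" && "$@" ) || echo "command failed in $dir" >&2
    done
}

# Example: update all plugin checkouts below ./plugins
#   for_each_repo plugins git pull --ff-only
```

A failure in one repository is reported on stderr and the loop continues, so one broken plugin does not block updates or I18N runs for the rest.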

And it is a waste of space on my hard disk when all the dependencies are on my disk a hundred times, as each repository has them duplicated.

That's where the move to gradle comes in.

Hmm, the more you explain the "git project", the less I understand the advantages for JOSM. Could someone please present the advantages for JOSM - no general advantages, but those which affect JOSM and would improve the situation we have. Where are the shortcomings you want to fix? And also state the disadvantages.

Some things that we currently face:

  • We want to drop some plugins. For a git repo, we can simply stop developing in that repository.
  • We have a plugins build of ~4h that runs every time we make a change to a single plugin. It would be nice to track which plugins have changed and only do the integration build e.g. once a day. => Faster feedback if a commit broke something
  • When a build fails, we want the author who pushed that change to be notified
  • We have several active plugins that are not in the main plugin repository but are hosted somewhere else. They set up their own CI and other infrastructure because JOSM can't provide it. We want to get them back to JOSM and have a central place for repositories and a common plugin setup.
  • We should give more fine-grained access to plugins and also send CI notifications to plugin authors only. In e.g. GitLab, the person who pushed the commits is automatically detected and gets the message about failing builds.
  • We have several hundred patches in Trac that we copy+paste around.
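
The "track which plugins have changed" point can be sketched as a small filter over a list of changed file paths. This assumes a single checkout with a plugins/<plugin-name>/... layout, which is an assumption for illustration and not the per-plugin repository layout proposed in this ticket; the function name `changed_plugins` is hypothetical.

```shell
# Reduce a list of changed file paths (stdin) to the set of affected plugins.
# Assumes a layout of plugins/<plugin-name>/... ; all other paths are ignored.
changed_plugins() {
    awk -F/ '$1 == "plugins" && NF > 2 { print $2 }' | sort -u
}

# Usage between two revisions:
#   git diff --name-only OLD NEW | changed_plugins
```

A CI job could then build only the plugins printed here and leave the full integration build to a daily schedule.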

And my personal problems with it:

  • I have made several patches in the past, some of them are unmerged. I cannot really tell which parts of the patches got merged and which did not, and have to look up the tickets for each of them, because we have no history about that in SVN. I have some in my local git repos, I have some in Trac tickets and I probably lost some. It makes developing more than one feature at a time a real pain.
  • E.g. during the hack weekend, I noticed many shortcomings of SVN where I just had to switch to the git repo (e.g. to do a quick 'git grep').
  • Tooling support for SVN is pretty limited. Eclipse does not support it out of the box, and neither does code (which I like most for merging, because it is fast).
  • I was traveling this spring. Originally I wanted to spend some time on JOSM, but I switched to another project because I had stretches of several days without internet, during which I would not have been able to commit.
  • I just hate to commit without being able to stage. You always have to be so careful. '--amend' is a common phrase in my bash history.
  • I'm annoyed by the 'JOSM does not build any more' mails but do not want to disable them since it's sometimes me breaking the stuff.
  • EDIT: During GSoC, I really missed having a 'merge request review' feature for the patches. Instead, I had to tell the students to do a fake merge request against the JOSM repo or against their own one at GitHub so that I could review their changes. Or I had to reference line numbers, which is pretty confusing (especially since there are multiple ways to number lines in a patch).

git lfs is an ugly workaround for a general design fault of git.

No, it's a workaround for a design fault in our build system. The source code repository is meant for source code, and only for source code. It is not meant for distributing files and it is not meant to be an artifact repository.

Last edited 8 months ago by michael2402

comment:23 Changed 8 months ago by mkoniecz

No, it's a workaround for a design fault in our build system. The source code repository is meant for source code, and only for source code. It is not meant for distributing files and it is not meant to be an artifact repository.

Maybe splitting the repo in two, a "real repository with source code" and a "history of all releases ever made with a backup of dependencies", would help?

Last edited 8 months ago by mkoniecz

comment:24 in reply to:  15 Changed 8 months ago by wiktorn

Replying to mkoniecz:

Yes, but note that in Git, cloning the full repository history is the standard way of getting a repository. It may be necessary to rewrite history if it has "hundreds of megabytes of outdated libraries".

Replying to mkoniecz:

No, it's a workaround for a design fault in our build system. The source code repository is meant for source code, and only for source code. It is not meant for distributing files and it is not meant to be an artifact repository.

Maybe splitting the repo in two, a "real repository with source code" and a "history of all releases ever made with a backup of dependencies", would help?

Mateusz,

I'm not sure I understand what your point is.

The plan, as we have discussed it so far (though it wasn't mentioned earlier), is to import SVN into a new Git repository. This means that we will rewrite the history in https://github.com/openstreetmap/josm. That has nothing to do with the fact that we will move jar files to Git LFS, though.

Why do we want to do this? Because we want to keep the history from both SVN repositories, svn.openstreetmap.org and josm.openstreetmap.de, and GitHub only has the history from josm.

Why we want to go ahead with Git LFS:

  • only to be able to build historical versions of JOSM using the Git repo

Together with the move to Git we want to move our dependencies out of the repository, so there will be no need for this second repo in the future.

I hope this has shed some light on the issue that you're raising (though I feel I don't know what issue you're trying to raise).

comment:25 in reply to:  10 ; Changed 8 months ago by Don-vip

Replying to Don-vip:

I'm working on #16231 right now, before looking at setting up GitLab.

Done. I started to look at Gitlab requirements:

  • 2 cores is the recommended number of cores
  • 8GB RAM is the recommended memory size for all installations

This is way too much for our small server. It must be installed on another server (TBD).

comment:26 in reply to:  22 Changed 8 months ago by stoecker

Replying to michael2402:

  • We want to drop some plugins. For a git repo, we can simply stop developing in that repository.

Same is true for a directory.

  • We have a plugins build of ~4h that runs every time we make a change to a single plugin. It would be nice to track which plugins have changed and only do the integration build e.g. once a day. => Faster feedback if a commit broke something

Should also be possible with a directory.

  • When a build fails, we want the author that pushed that change to be notified

Should also be possible with a directory.

  • We have several active plugins that are not in the main plugin repository but are hosted somewhere else instead. They set up their own CI and other infrastructure because JOSM can't provide it. We want to get them back to JOSM and have a central place for repositories and a common plugin setup.

Actually, why? Why should JOSM support people who want to set up their own stuff? I never considered that a useful approach. Either go your own way or don't. Mixed approaches only make a lot of work for maintainers, but have no advantage.

  • We should give more fine-grained access to plugins and also send CI notifications to plugin authors only. In e.g. GitLab, the person that pushed the commits is automatically detected and gets the message about failing builds.

A valid point.

  • We have several hundred patches in Trac that we copy+paste around.

Hmm, nearly all of them can be considered dead. They are mainly still there because nobody closes the tickets, assuming that they may still have some use or that someone will continue.

And my personal problems with it:

  • I have made several patches in the past, some of them are unmerged. I cannot really tell which parts of the patches got merged and which not and have to look up the tickets for each of them, because we have no history about that in SVN. I have some in my local git repos, I have some in Trac tickets and I probably lost some. It makes developing more than one feature at a time a real pain.

Same argument as above. When you don't care about them enough to check, anything remaining is dead and uninteresting.

  • E.g. during the hack weekend, I noticed many shortcomings of SVN where I just had to switch to the Git repo (e.g. to do a quick @git grep@).

That's personal style. Grepping svn is easy as well, I use that as a major part of work.

  • Tooling support for SVN is pretty limited. Eclipse does not support it out of the box, code does not (which I like most for merging, because it is fast)

Well, I use Eclipse with SVN when I must use Eclipse. I don't remember the setup being complicated.

  • I was traveling this spring. Originally I wanted to spend some time on JOSM, but switched to another project because I had stretches of several days without internet, during which I would not have been able to commit.
  • I just hate to commit without being able to stage. You always have to be so careful. '--amend' is a common phrase in my bash history.

That's style again. You simply like git workflow more.

  • I'm annoyed by the 'JOSM does not build any more' mails but do not want to disable them since it's sometimes me breaking the stuff.

Which would not change for core authors. A broken build is always relevant for a core member even if not caused by him.

  • EDIT: During GSoC, I really missed that we do not have a 'merge request review' feature for the patches. Instead, I had to tell the students to do a fake merge request against the JOSM repo or against their own one at GitHub so that I could review their changes. Or I had to reference line numbers, which is pretty confusing (especially since there are multiple ways to number lines in a patch)

Review process is a valid argument. Don't know about gitlab, but the github stuff has some advantages (and disadvantages).

git lfs is an ugly workaround for a general design fault of git.

No, it's a workaround for a design fault in our build system. The source code repository is meant for source code, and only for source code. It is not meant for distributing files and it is not meant to be an artifact repository.

Well, if you read the GPL again, then you will find that it requires that everything needed to build, besides the operating system and generic tools, must be supplied together with the source. Our build system is not faulty, it simply fulfills the required license conditions (and if we interpret the license very strictly, probably not even that).

comment:27 in reply to:  25 Changed 8 months ago by stoecker

Replying to Don-vip:

Replying to Don-vip:

I'm working on #16231 right now, before looking at setting up GitLab.

Done. I started to look at Gitlab requirements:

  • 2 cores is the recommended number of cores
  • 8GB RAM is the recommended memory size for all installations

This is way too much for our small server. It must be installed on another server (TBD).

We could ask for the last upgrade step with the next renewal: CX60: 8 vCores, 32 GB, 600 GB

Though the CX series of Hetzner will end soon, they moved to a new platform for virtual servers. The new cloud series is of no use for us I think: https://www.hetzner.de/cloud (not enough space).

Or we ask for a real server: https://www.hetzner.de/dedicated-rootserver/matrix-ex

An EX51 or EX51-SSD would be a good choice. Probably the EX51 variant, as we don't need that fast a disk (it's a must for my mapserver though). The EX51 is BTW cheaper than the CX60. :-)

comment:28 Changed 8 months ago by stoecker

For reference:

  • EX51-SSD: 8 (virtual) cores, 64GB RAM, 500GB SSD (with RAID) or 1TB SSD (without RAID)
  • EX51: 8 (virtual) cores, 64GB RAM, 4TB HD (with RAID) or 8TB HD (without RAID)
  • we have: 4 virtual cores, 16GB RAM, 400GB SSD (half full)

comment:29 Changed 8 months ago by stoecker

A server upgrade could also solve the jenkins issues we have without changing the system :-) Comments?

comment:30 Changed 8 months ago by Don-vip

Well if they are OK to provide us an EX51 for free, let's ask for it :) When is the next renewal due?

comment:31 Changed 8 months ago by stoecker

Ask for HD or SSD or SSD without RAID?

I'm operating one EX51-SSD without RAID and that's fine. More space, but you lose safety in case of a hardware failure.

Ah, we need to renew anyway. The server is paid until 1.12.2018. Seems 3 months are gone, because I remember February as the date. Probably they want to get rid of the CX sponsoring anyway :-)

comment:32 in reply to:  31 Changed 8 months ago by stoecker

I'm operating one EX51-SSD without RAID and that's fine. More space, but you lose safety in case of a hardware failure.

OTOH: I had one HDD hardware failure in many years and many servers with Hetzner, and after they plugged in the new HDD I copied the config of the empty one to the valid one instead of the other way round, and afterwards had to set up the system again anyway.

comment:33 Changed 8 months ago by Don-vip

I think the 1TB SSD without RAID is good for us. You're backing up data every day, right?

comment:34 in reply to:  33 Changed 8 months ago by stoecker

Replying to Don-vip:

I think the 1TB SSD without RAID is good for us. You're backing up data every day, right?

Note: to be clear, 1TB is only about 931GiB.
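The gap between the advertised size and the binary size can be checked with a short calculation. This sketch assumes the drive is sold in decimal terabytes (10^12 bytes) and ignores filesystem and partitioning overhead, which reduce the usable space further.

```python
TB = 10**12   # decimal terabyte, as used for advertised drive sizes
GIB = 2**30   # binary gibibyte


def tb_to_gib(terabytes):
    """Convert decimal terabytes to binary gibibytes."""
    return terabytes * TB / GIB


print(f"{tb_to_gib(1):.1f} GiB")  # prints "931.3 GiB"
```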

Yes, I have a daily backup running. That catches all essentials for a re-setup. Backup recovery is thus essentially identical to a reinstall from scratch, with data and config copyable from the backup. Volatile data like Jenkins builds or changes outside the data directories (mainly etc, var, home) are lost. A server move could be used to test a backup reinstall - the last one was some time ago...

Last edited 8 months ago by stoecker

comment:35 Changed 8 months ago by michael2402

If we set up everything again, wouldn't it be good to put all services that run on that server in containers? One container for GitLab, one for the CI worker, one for Trac. That way, we can restrict the resources of each of them more easily and it won't be much work to move one service to another server.

About GitLab resources: I have GitLab running on a server with 8GB RAM. In reality it needs much less than 2GB of RAM (that's the limit I have set). But I'm using several gigabytes to buffer the disk contents in RAM, which provides very fast access times. So we can either opt for an SSD or for a lot of RAM (I personally would prefer the SSD).

comment:36 Changed 8 months ago by wiktorn

Description: modified (diff)

I have a semi-final version of the migration script from SVN to Git. I think that once we decide to migrate JOSM we should have at least 2 modules at this point, jmapviewer and JOSM, so we can:

  • have a subset of jmapviewer (as defined by our svn:externals) within the JOSM tree
  • have the full history in the jmapviewer module

comment:37 in reply to:  33 Changed 7 months ago by stoecker

Replying to Don-vip:

I think the 1TB SSD without RAID is good for us. You're backing up data every day, right?

Hetzner would sponsor the existing CX50 for another year, but not an EX51.

Either we get somebody else to pay for it or we should try to find another provider.

comment:38 Changed 5 months ago by simon04

Frederik, Dirk and Vincent submitted a grant application to FOSSGIS concerning a new server of type Hetzner EX51-SSD: https://www.fossgis.de/wiki/F%C3%B6rderantr%C3%A4ge/JOSM-Server

Can anyone estimate whether the grant will be accepted and when the server will be available?

comment:39 Changed 5 months ago by simon04

Btw, more minimalistic self-hosted alternatives to GitHub and GitLab include

  • Gogs
  • Gitea

Both are written in Go and claim to run on a Raspberry Pi. See https://docs.gitea.io/en-us/comparison/ for a feature comparison among GitHub, GitLab, Gogs and Gitea.

Assuming that we stick to Jenkins for CI, Gitea seems to be a viable alternative for our needs.

comment:40 Changed 5 months ago by michael2402

Some of the reasons we wanted to switch to git(lab) were:

  • The ability to review pull requests
  • Better integration of the CI with the repository (=> only people that committed get mails, CI for branches)
  • Easy to set up

comment:41 Changed 4 months ago by bagage

Is the new server available now? Has the GitLab migration started?

comment:42 in reply to:  41 Changed 4 months ago by Don-vip

Replying to bagage:

Is the new server now available?

Yes, see #17231

Has the GitLab migration started?

Not yet, I plan to begin this week.

comment:43 Changed 4 months ago by anonymous

If you ever switch to git, you can use git submodules as well. Basically, a submodule is a soft link to an external (git) repository. I just recently discovered this feature, so you might not know it exists.

Using submodules you can create a separate repo just for images, for example, without breaking the current build scripts. It can be an intermediate step towards fully separating JOSM modules.

https://git-scm.com/docs/git-submodule
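A minimal sketch of what adding such a submodule could look like when scripted. It only wraps the documented `git submodule add <repository> <path>` command; the repository URL and the images directory are hypothetical placeholders, and add_submodule has to be run inside an existing git working tree.

```python
import subprocess


def submodule_add_command(url, path):
    """Build the argument list for `git submodule add <url> <path>`."""
    return ["git", "submodule", "add", url, path]


def add_submodule(url, path, repo_dir="."):
    """Run the command in repo_dir; raises CalledProcessError if git fails."""
    subprocess.run(submodule_add_command(url, path), cwd=repo_dir, check=True)


# Hypothetical repository URL and target directory, for illustration only.
command = submodule_add_command("https://example.org/josm-images.git", "images")
print(" ".join(command))
```

After the command has run, git records the link in a .gitmodules file and pins the submodule to a specific commit, so the parent repository stays small while the build can still find the files at the usual path.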

comment:44 Changed 7 weeks ago by Don-vip

I finally managed to set up GitLab:
https://josm.openstreetmap.de/gitlab/

The instance is currently empty and sign-up is disabled. I will try to play a little bit with tracboat now.
