Conflicts
Questions
- What do I do when my changes conflict with someone else’s?
Objectives
- Explain what conflicts are and when they can occur
- Explain how git behaves when it can’t merge your changes
- Resolve conflicts from a merge
We’ve shown an example of the sorts of changes which Git can merge automatically - where the changes are in separate parts of the file. But, as soon as people can work in parallel, they’ll likely step on each other’s toes.
This will even happen with a single person: if we are working on a piece of software on both our laptop and a server in the lab, we could make different changes to each copy. Version control helps us manage these conflicts by giving us tools to resolve overlapping changes.
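To make the difference between a clean merge and a conflict concrete, here is a minimal Python sketch. It is not how Git is implemented; it only illustrates the idea that two edits clash when they touch the same (or directly neighbouring) lines of a shared starting file. The function names `changed_ranges` and `edits_overlap`, and the two "separate" edits, are made up for this illustration; the other file contents are the versions of mean.py used in this episode.

```python
import difflib


def changed_ranges(base, edited):
    """Return the (start, end) line ranges of `base` that `edited` changes."""
    matcher = difflib.SequenceMatcher(a=base, b=edited, autojunk=False)
    return [
        (i1, i2)
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def edits_overlap(base, ours, theirs):
    """Return True if the two edits change overlapping or touching regions."""
    return any(
        a1 <= b2 and b1 <= a2
        for a1, a2 in changed_ranges(base, ours)
        for b1, b2 in changed_ranges(base, theirs)
    )


base = [
    'import pandas as pd',
    'dataframe = pd.read_csv("rgb.csv")',
    'blues = dataframe["blue"]',
    'print(blues.mean())',
]

# Two hypothetical edits that stay out of each other's way.
first_line_edit = ['import pandas'] + base[1:]
last_line_edit = base[:3] + ['print("mean:", blues.mean())']

# Two edits that both rewrite the middle of the file.
alice = [
    'import pandas as pd',
    'COLUMN = "blue"',
    'dataframe = pd.read_csv("rgb.csv")',
    'subset = dataframe[COLUMN]',
    'print(subset.mean())',
]
bob = [
    'import pandas as pd',
    'COLOUR = "red"',
    'dataframe = pd.read_csv("rgb.csv")',
    'coloured = dataframe[COLOUR]',
    'print(coloured.mean())',
]

print(edits_overlap(base, first_line_edit, last_line_edit))  # False
print(edits_overlap(base, alice, bob))  # True
```

The sketch treats edits on directly neighbouring lines as clashing, which is a deliberate simplification on the cautious side.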
To see how we can resolve conflicts, we must first create one.
The file mean.py should currently look like this on the main branch of both partners’ copies of our mean repository:
cat mean.py
import pandas as pd
dataframe = pd.read_csv("rgb.csv")
blues = dataframe["blue"]
print(blues.mean())
Let’s let Alice make a change to her file: she decides to store the column used for the mean calculation in a constant at the top of her script:
nano mean.py
cat mean.py
import pandas as pd
COLUMN = "blue"
dataframe = pd.read_csv("rgb.csv")
subset = dataframe[COLUMN]
print(subset.mean())
and then push the change to GitHub:
git add mean.py
git commit -m "Make the subset column a constant"
[main cde8d2e] Make the subset column a constant
 1 file changed, 3 insertions(+), 2 deletions(-)
git push origin main
Enumerating objects: 17, done.
Counting objects: 100% (17/17), done.
Delta compression using up to 12 threads
Compressing objects: 100% (15/15), done.
Writing objects: 100% (15/15), 1.44 KiB | 1.44 MiB/s, done.
Total 15 (delta 8), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (8/8), completed with 1 local object.
To github.com:alice/mean.git
   dea7c3c..cde8d2e  main -> main
Now let’s have Bob make a change to his copy without updating from GitHub. We’ll make a change which overlaps with Alice’s changes - Bob’s had the same idea, but has given his constant a different name, and has changed its value back to “red”:
nano mean.py
cat mean.py
import pandas as pd
COLOUR="red"
dataframe = pd.read_csv("rgb.csv")
coloured = dataframe[COLOUR]
print(coloured.mean())
We can commit the change locally:
git add mean.py
git commit -m "Parametrised the column to subset"
[main 7720ff0] Parametrised the column to subset
 1 file changed, 5 insertions(+), 1 deletion(-)
When Bob tries to push his changes to GitHub, Git complains and refuses:
git push origin main
To github.com:alice/mean.git
 ! [rejected]        main -> main (fetch first)
error: failed to push some refs to 'github.com:spikelynch/mean.git'
hint: Updates were rejected because the remote contains work that you do
hint: not have locally. This is usually caused by another repository pushing
hint: to the same ref. You may want to first integrate the remote changes
hint: (e.g., 'git pull ...') before pushing again.
hint: See the 'Note about fast-forwards' in 'git push --help' for details.
Git rejects the push because it detects that the remote repository has new updates that have not been incorporated into the local branch. What we have to do is pull the changes from GitHub.
Now we come to some exciting, or exasperating, developments: Git is an actively developed piece of open-source software. Over the last couple of years, there have been some changes to the way Git handles the situation we’re about to trigger, and depending on when you installed Git, we might get some different sorts of behaviour at the command line.
I’ll have to go into a little bit of detail to explain what’s happening, even though the net result will be the same.
What we’re going to use is the git pull command. This asks Git to do two things: first, fetch the HEAD of the branch we’re interested in from a remote repository, giving us a local copy of the latest version from the remote; then, merge that set of changes into our current branch.
We’ve set up a situation, however, where Bob’s and Alice’s versions of the file are inconsistent in a way which merge is unable to resolve.
Let’s see what happens. I’ll check which messages we’re each getting and step through the explanations.
git pull origin main
If Bob has a recent version of Git (2.33 or newer), pulling from Alice’s remote should result in a long message like the following, ending in a fatal error:
remote: Enumerating objects: 20, done.
remote: Counting objects: 100% (20/20), done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 18 (delta 8), reused 18 (delta 8), pack-reused 0
Unpacking objects: 100% (18/18), 1.62 KiB | 110.00 KiB/s, done.
From github.com:alice/mean
 * branch            main       -> FETCH_HEAD
   2408b26..cde8d2e  main       -> origin/main
hint: You have divergent branches and need to specify how to reconcile them.
hint: You can do so by running one of the following commands sometime before
hint: your next pull:
hint:
hint:   git config pull.rebase false  # merge
hint:   git config pull.rebase true   # rebase
hint:   git config pull.ff only       # fast-forward only
hint:
hint: You can replace "git config" with "git config --global" to set a default
hint: preference for all repositories. You can also pass --rebase, --no-rebase,
hint: or --ff-only on the command line to override the configured default per
hint: invocation.
fatal: Need to specify how to reconcile divergent branches.
Can everyone who’s got the fatal error raise their hand and we’ll see how we’re doing? Thanks.
What’s happened here is that Git is refusing to reconcile the two branches because we haven’t told it which strategy to use by default.
To get our heads around this, we can go back to the directed acyclic graphs.
So far, we’ve looked at reconciling two branches by merging them. You can think of this as Git’s attempt to make a version which contains all of the commits from both branches.
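If the graph picture helps, the following Python sketch shows one way to think about what "divergent" means. It is a toy model, not Git's own code: commits are plain labels, `parents` maps each commit to the commits it was built on, and the names `ancestors` and `relationship` are invented for this example. Two branches have diverged when neither tip can be reached by walking back from the other.

```python
def ancestors(commit, parents):
    """Return `commit` plus every commit reachable by following parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def relationship(local, remote, parents):
    """Describe how the local branch tip relates to the remote branch tip."""
    if local == remote:
        return "up to date"
    if local in ancestors(remote, parents):
        return "behind (can fast-forward)"
    if remote in ancestors(local, parents):
        return "ahead (can push)"
    return "diverged (needs a merge or a rebase)"


# Hypothetical labels standing in for commit IDs.
parents = {
    "base": [],
    "alice": ["base"],          # Alice's commit, already on GitHub
    "bob": ["base"],            # Bob's local commit
    "merge": ["bob", "alice"],  # the merge commit Bob will create
}

print(relationship("bob", "alice", parents))    # diverged (needs a merge or a rebase)
print(relationship("merge", "alice", parents))  # ahead (can push)
print(relationship("alice", "merge", parents))  # behind (can fast-forward)
```

The last line corresponds to what Alice sees at the end of this episode: once Bob's merge is on GitHub, her copy is simply behind and can fast-forward.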
The other way to reconcile two divergent branches is a separate tool, git rebase. Rebasing tells Git that I want to fetch the latest version of this branch from the remote, then take all of my local commits - all of the changes I’ve made since our branches diverged - and apply them to the latest HEAD of the branch. In a sense, it’s like rewriting history - we fast-forward the repository to get everyone else’s work, and then apply our own.
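As a rough picture of the idea only, here is a toy Python model of a rebase on two straight-line histories. It assumes each history is just a list of commit labels, which is far simpler than what Git does (Git re-applies each change as a patch, and that step can itself produce conflicts). The function name `rebase` and the labels are made up for this sketch.

```python
def rebase(local_history, remote_history):
    """Replay the commits only we have on top of the remote history."""
    # Count how many commits the two histories share from the start.
    shared = 0
    for mine, theirs in zip(local_history, remote_history):
        if mine != theirs:
            break
        shared += 1
    local_only = local_history[shared:]
    # A replayed commit sits on a new parent, so it is a new commit;
    # the trailing apostrophe marks it as a rewritten copy.
    return remote_history + [commit + "'" for commit in local_only]


print(rebase(["base", "bob"], ["base", "alice"]))  # ['base', 'alice', "bob'"]
```

The result is a single line of history with no merge commit, which is the "linear history" that makes rebasing attractive to larger teams.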
Rebasing can be complicated and counterintuitive, which is why we don’t have time to cover it in more detail here - it’s become common practice in large teams because it can effectively give the trunk of a repo a linear history, but it’s not really recommended for beginners.
The reason for the change in the behaviour of git pull is that rebasing is becoming increasingly popular as a way of incorporating the latest updates from the team into your work. For small teams or solo developers, however, merging is fine.
We want to tell Git to not rebase, but to try to merge.
If you’re one of the people who got the fatal error, try running git pull again, with the --no-rebase command line flag:
git pull --no-rebase origin main
From github.com:alice/mean
 * branch            main       -> FETCH_HEAD
Auto-merging mean.py
CONFLICT (content): Merge conflict in mean.py
Automatic merge failed; fix conflicts and then commit the result.
Now, if there are any people who didn’t get the fatal error the first time, let’s see if this looks familiar:
remote: Enumerating objects: 5, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (1/1), done.
remote: Total 3 (delta 2), reused 3 (delta 2), pack-reused 0
Unpacking objects: 100% (3/3), done.
From https://github.com/alice/mean
 * branch            main       -> FETCH_HEAD
   29aba7c..dabb4c8  main       -> origin/main
Auto-merging mean.py
CONFLICT (content): Merge conflict in mean.py
Automatic merge failed; fix conflicts and then commit the result.
This is how git pull behaved before version 2.27, when it used to merge by default and not warn you about it. If this happened to you, Git has fetched Alice’s copy and tried, and failed, to merge it with the local changes.
The third possibility is for people with a copy of Git with a version between 2.27 and 2.33: this gives the long set of warnings about needing to specify a reconciliation strategy, but defaults to merge anyway, and should look something like this:
remote: Enumerating objects: 20, done.
remote: Counting objects: 100% (20/20), done.
remote: Compressing objects: 100% (9/9), done.
remote: Total 18 (delta 8), reused 18 (delta 8), pack-reused 0
Unpacking objects: 100% (18/18), 1.62 KiB | 110.00 KiB/s, done.
From github.com:alice/mean
 * branch            main       -> FETCH_HEAD
   2408b26..cde8d2e  main       -> origin/main
hint: You have divergent branches and need to specify how to reconcile them.
hint: You can do so by running one of the following commands sometime before
hint: your next pull:
hint:
hint:   git config pull.rebase false  # merge (the default strategy)
hint:   git config pull.rebase true   # rebase
hint:   git config pull.ff only       # fast-forward only
hint:
hint: You can replace "git config" with "git config --global" to set a default
hint: preference for all repositories. You can also pass --rebase, --no-rebase,
hint: or --ff-only on the command line to override the configured default per
hint: invocation.
Auto-merging mean.py
CONFLICT (content): Merge conflict in mean.py
Automatic merge failed; fix conflicts and then commit the result.
You can check which version of Git you have installed by running this command:
git --version
git version 2.37.0 (Apple Git-136)
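If it helps to see the three cases side by side, the small Python sketch below maps a version string onto the behaviour described above. The version boundaries are the ones quoted in this episode, treated here as "2.33 or newer", "2.27 up to but not including 2.33", and "older than 2.27"; that reading of the boundaries is an assumption, and the function names are invented for the example.

```python
import re
import subprocess


def pull_behaviour(version_text):
    """Map the output of `git --version` to the pull behaviour described above."""
    match = re.search(r"(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError(f"no version number found in {version_text!r}")
    version = (int(match.group(1)), int(match.group(2)))
    if version >= (2, 33):
        return "stops with a fatal error until told how to reconcile"
    if version >= (2, 27):
        return "prints the hints, then merges"
    return "merges without comment"


def installed_git_version():
    """Ask the local Git for its version string (requires Git on the PATH)."""
    result = subprocess.run(
        ["git", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout


print(pull_behaviour("git version 2.37.0 (Apple Git-136)"))
```

Passing the result of `installed_git_version()` to `pull_behaviour` gives the answer for the machine you are working on.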
Let’s just go around the room and check that everyone has a big CAPS-LOCK message saying CONFLICT (usually you don’t look for conflict in this kind of workshop, but it’s git.)
So now that we’re all on the same page:
- Git has tried to merge changes from the remote branch.
- Git has detected that changes made to the local copy overlap with those made to the remote repository.
- Therefore, Git refuses to merge the two versions to stop us from trampling on our previous work.
The conflict is marked in the affected file:
cat mean.py
import pandas as pd
<<<<<<< HEAD
COLOUR="red"
dataframe = pd.read_csv("rgb.csv")
coloured = dataframe[COLOUR]
print(coloured.mean())
=======
COLUMN = "blue"
dataframe = pd.read_csv("rgb.csv")
subset = dataframe[COLUMN]
print(subset.mean())
>>>>>>> cde8d2ea9799a6ccbcfafabd311913cd3f70df17
Our change is preceded by <<<<<<< HEAD. Git has then inserted ======= as a separator between the conflicting changes and marked the end of the content downloaded from GitHub with >>>>>>>. (The string of letters and digits after that marker identifies the commit we’ve just downloaded.)
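The layout of the markers can be made explicit with a short Python sketch that splits a conflicted file into "our" lines and "their" lines. This is an illustration of the marker format only, not a tool that Git provides; the function name `split_conflict` is made up, and the sketch assumes the default two-sided marker style shown in this episode.

```python
def split_conflict(text):
    """Return a list of (ours, theirs) line lists, one pair per conflict."""
    conflicts = []
    ours, theirs = [], []
    state = "outside"
    for line in text.splitlines():
        if line.startswith("<<<<<<< "):
            # Start of a conflict: what follows is the local version.
            state, ours, theirs = "ours", [], []
        elif line.startswith("=======") and state == "ours":
            # Separator: what follows is the incoming version.
            state = "theirs"
        elif line.startswith(">>>>>>> ") and state == "theirs":
            conflicts.append((ours, theirs))
            state = "outside"
        elif state == "ours":
            ours.append(line)
        elif state == "theirs":
            theirs.append(line)
    return conflicts


conflicted = """import pandas as pd
<<<<<<< HEAD
COLOUR="red"
dataframe = pd.read_csv("rgb.csv")
coloured = dataframe[COLOUR]
print(coloured.mean())
=======
COLUMN = "blue"
dataframe = pd.read_csv("rgb.csv")
subset = dataframe[COLUMN]
print(subset.mean())
>>>>>>> cde8d2ea9799a6ccbcfafabd311913cd3f70df17
"""

for ours, theirs in split_conflict(conflicted):
    print("Local (HEAD) version:")
    print("\n".join(ours))
    print("Incoming version:")
    print("\n".join(theirs))
```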
It is now up to us to edit this file to remove these markers and reconcile the changes. We can do anything we want:
- Keep the change made in the local repository
- Keep the change made in the remote repository
- Write something new to replace both
- Get rid of the change entirely
Let’s replace both so that the file looks like this - we’ll keep Alice’s constant name COLUMN but change the default value back to red.
cat mean.py
import pandas as pd
COLUMN="red"
dataframe = pd.read_csv("rgb.csv")
subset = dataframe[COLUMN]
print(subset.mean())
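Before telling Git that the conflict is resolved, it can be worth checking that no marker lines were left behind by accident. The Python sketch below is one simple way to do that; it is a heuristic rather than anything Git itself runs, the function name `leftover_markers` is invented for this example, and it would also flag a legitimate line that happens to start with seven equals signs.

```python
def leftover_markers(text):
    """Return the 1-based numbers of lines that still look like conflict markers."""
    prefixes = ("<<<<<<<", "=======", ">>>>>>>")
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line.startswith(prefixes)
    ]


resolved = """import pandas as pd
COLUMN="red"
dataframe = pd.read_csv("rgb.csv")
subset = dataframe[COLUMN]
print(subset.mean())
"""

print(leftover_markers(resolved))  # []
```

An empty list means the file looks clean; any numbers returned point at lines that still need editing.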
To finish merging, we let Git know that we’ve resolved the conflict by using git add.
This always feels like an anticlimax to me - when I’ve resolved a conflict I feel like I should be able to say git resolve or git peace-out or something - but from Git’s point of view, this resolution is just another set of changes to add to the staging area:
git add mean.py
git status
On branch main
All conflicts fixed but you are still merging.
  (use "git commit" to conclude merge)

Changes to be committed:
	modified:   mean.py
git commit -m "Merge changes from GitHub"
[main 2abf2b1] Merge changes from GitHub
Now Bob can push his changes to GitHub:
git push origin main
Enumerating objects: 10, done.
Counting objects: 100% (10/10), done.
Delta compression using up to 12 threads
Compressing objects: 100% (6/6), done.
Writing objects: 100% (6/6), 709 bytes | 709.00 KiB/s, done.
Total 6 (delta 2), reused 0 (delta 0), pack-reused 0
remote: Resolving deltas: 100% (2/2), completed with 2 local objects.
To github.com:alice/mean.git
   cde8d2e..19cb059  main -> main
Git keeps track of what we’ve merged with what, so Alice doesn’t have to fix things by hand again when she pulls:
git pull origin main
remote: Enumerating objects: 10, done.
remote: Counting objects: 100% (10/10), done.
remote: Compressing objects: 100% (4/4), done.
remote: Total 6 (delta 2), reused 6 (delta 2), pack-reused 0
Unpacking objects: 100% (6/6), 689 bytes | 137.00 KiB/s, done.
From github.com:spikelynch/mean
 * branch            main       -> FETCH_HEAD
   cde8d2e..19cb059  main       -> origin/main
Updating cde8d2e..19cb059
Fast-forward
 mean.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
We get the merged file:
cat mean.py
import pandas as pd
COLUMN="red"
dataframe = pd.read_csv("rgb.csv")
subset = dataframe[COLUMN]
print(subset.mean())
We don’t need to merge again because Git knows someone has already done that.
Key Points
- Conflicts occur when two or more people change the same lines of the same file.
- Git can emit a lot of warning messages when this happens
- There have been recent changes to how Git behaves when you pull updates from a remote
- The version control system does not allow people to overwrite each other’s changes blindly, but highlights conflicts so that they can be resolved.
All materials copyright Sydney Informatics Hub, University of Sydney