
Added Dockerfile #3

Closed
wants to merge 3 commits into from

Conversation

Manouchehri (Contributor)

Had some issues with cmake on Xenial, so switching to bionic (next LTS) seemed to be the route of least resistance.

@Manouchehri (Contributor Author)

Using Prebuilt Image

docker run -it --rm thawsystems/retdec

Building Docker Image

git clone https://github.com/Manouchehri/retdec
docker build -t retdec .
docker run -it --rm retdec

@aidanhs commented Dec 13, 2017

How about removing the maintainer (since it'll be maintained by repo contributors if it ends up in the repo) and doing COPY rather than cloning?

@Manouchehri (Contributor Author)

@aidanhs Hmm, doing COPY gets weird fast since we cannot assume that the submodules have been cloned (otherwise it breaks most automated Docker build systems).

Could we merge this in now and revisit it once #14 is solved?

@cquick97 left a comment

Pull i386 Ubuntu image instead of x64, since only 32-bit is supported.

FROM i386/ubuntu:bionic

"Supported architectures (32b only): Intel x86, ARM, MIPS, PIC32, and PowerPC"

@Manouchehri (Contributor Author) commented Dec 13, 2017

@cquick97 That's for target binaries, not the host platform?

@cquick97

Sure, but if 64-bit isn't required, why not just stick to 32-bit? I noticed that I couldn't compile a test binary within the container with gcc -m32, as there were architecture dependency problems. Of course you can just compile elsewhere and analyze within the container, but for simple testing I think having i386 dependencies would be better.

I'm halfway through an i386 build right now, so we'll see if that fixes anything. Or maybe I'm just crazy/dumb and you can ignore my nonsense.

@Manouchehri (Contributor Author) commented Dec 13, 2017

You can install a 32-bit compiler if needed.

docker run -it --rm --user root thawsystems/retdec
apt-get install gcc-multilib
su - retdec
export PATH=${PATH}:/home/retdec/retdec-install/bin
echo '#include <stdio.h>' > test.c
echo 'int main() { printf("Dave says hello.\n"); return 0; }' >> test.c
gcc -m32 test.c
decompile.sh a.out

I don't think changing the host is a good idea; going back to 32-bit Linux distributions is a nightmare. Other RE tools like IDA Pro, Binary Ninja, etc. are already 64-bit only, so I think it's an acceptable requirement.

@cquick97

That's probably a better idea. Disregard :)

@breznak commented Dec 14, 2017

👏 for the Docker image instead of a full dependency installation.
Also, imho the 64-bit host is more future-proof, as 64-bit decompilation support will hopefully be added soon? #9

@aidanhs commented Dec 14, 2017

@Manouchehri the main thing I wasn't keen on was cloning the whole repo graph twice - maybe COPY and then git submodule update --init --recursive? Still not perfect (because it discards changes people might have made to change the commit of a submodule) but at least you can make changes in this repo and rebuild with those changes, which is one of the use-cases of a Dockerfile.

I'm just making suggestions while waiting for official feedback :)

@s3rvac (Member) commented Dec 14, 2017

@breznak I am not sure if I understood your question correctly, but the future support for decompilations of 64b binaries (#9) will be independent from the OS that is used to build and run RetDec. That is, once there is support for decompilations of 64b files, you will be able to use both 32b/64b RetDec builds.

@breznak commented Dec 14, 2017

@s3rvac my idea was that on a 64-bit host you can run both 64-bit and 32-bit apps if you were to develop or debug something in the image. But basically disregard that, as it boils down to

I don't think changing the host is a good idea; going back to 32-bit Linux distributions is a nightmare. Other RE tools like IDA Pro, Binary Ninja, etc. are already 64-bit only, so I think it's an acceptable requirement.

@breznak commented Dec 14, 2017

Also, Alpine Linux is a very popular Docker container OS because of its low footprint:
https://hub.docker.com/_/alpine/

It may be more work to set everything up (once), but then Alpine could be a lighter alternative to a full-blown Ubuntu. What do you think?

@Manouchehri (Contributor Author)

@breznak While Alpine is lighter, I prefer to use Ubuntu as it's more similar to the platform developers will be using and testing with.

@Manouchehri (Contributor Author)

@aidanhs You do not need to clone any of the submodules before building the image. See the commands listed in #3 (comment).

@aidanhs commented Dec 14, 2017

@Manouchehri my point is that if you initialise submodules, you can COPY in the current working directory.

People can use your Dockerfile until they actually want to make and test a change, at which point they need to either edit it to use my proposed solution (COPY in the repo) or push every single thing they want to test to their own repo. I think COPY is a better solution, and following that with RUN git submodule update --init --recursive addresses your comment at #3 (comment).

Part of the benefit of Dockerfiles is to ease development, which is what I'm shooting for.
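For illustration, a COPY-based variant along those lines might look roughly like this. This is only a sketch: it assumes the build context is a full clone that includes .git (so the submodule update can run), and the package list simply mirrors the snippet under review, plus ca-certificates so the HTTPS submodule fetch works.

FROM ubuntu:bionic
RUN apt-get -y update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        build-essential git bc graphviz upx cmake python zlib1g-dev flex bison \
        libtinfo-dev autoconf pkg-config m4 libtool wget ca-certificates
WORKDIR /home/retdec/retdec
# Copy the already-cloned working tree (including .git) instead of cloning from GitHub.
COPY . .
RUN git submodule update --init --recursive && \
    mkdir build && cd build && \
    cmake .. -DCMAKE_INSTALL_PREFIX=/home/retdec/retdec-install && \
    make -j$(nproc) && \
    make install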

@Manouchehri (Contributor Author)

The problem is COPY only works if the submodules are initialized. Breaking expected build behavior isn't a good idea.

My suggestion is to make a separate Dockerfile for your intended development use case. This one is meant for users to get started quickly with RetDec.

@HugoKlepsch (Contributor)

@Manouchehri I must agree with @aidanhs here. If you clone the repo while building the image, that Dockerfile could never be used to test local changes. That means you can never test whether the Docker image works without committing and pushing the change!

I also don't think that running git submodule update --init --recursive is a good idea, because that would discard any changes made to submodule versions.

At my work we also use Docker. Our general flow is to have the Dockerfile build an image from the local repository. This allows us to test local changes. Then when we want to publish a new version of the image, our CI system fully checks out the master branch and builds the new image.

If you are concerned users won't fully clone the repository, could we not add a README.md that explains the image creation process?
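As a rough illustration of that flow (purely a sketch: it assumes a COPY-based Dockerfile as discussed above, and reuses the thawsystems/retdec image name from earlier in this thread only as an example):

# Local development: build straight from the working tree with submodules initialized.
docker build -t retdec-dev .

# CI publish job: clean recursive checkout of master, then build and push.
git clone --recursive https://github.com/avast-tl/retdec retdec-ci
docker build -t thawsystems/retdec:latest retdec-ci
docker push thawsystems/retdec:latest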

@HugoKlepsch (Contributor) left a comment

Thanks for your contribution! 😊
Some minor feedback...

Dockerfile Outdated
mkdir build && \
cd build && \
cmake .. -DCMAKE_INSTALL_PREFIX=/home/retdec/retdec-install && \
make && \
Contributor

Is there some way we can auto-detect the number of processors available?

Contributor Author

It could be done with a little bash, but we should probably try to improve the normal build process instead of diverging and hacking about in a Dockerfile.


I don't know if this should be multi-platform, but on Linux you can use
make -j$(nproc)
https://unix.stackexchange.com/questions/208568/how-to-determine-the-maximum-number-to-pass-to-make-j-option

Member

I would also suggest using make -j$(nproc) because otherwise only a single core will be used for the build. Alternatively, set MAKEFLAGS="-j $(nproc)".
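For example, the relevant lines of the build step could read as follows (a sketch; the trailing make install is assumed from the install prefix used above):

cmake .. -DCMAKE_INSTALL_PREFIX=/home/retdec/retdec-install && \
make -j$(nproc) && \
make install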


-j (if the project is huge, it will sometimes spawn too many jobs) and -j $(nproc) are both good; you need coreutils for nproc, but I would expect it to be pulled in with something like build-essential or already be there.

Contributor Author

Done!

ENV HOME /home/retdec

RUN apt-get -y update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y build-essential git bc graphviz upx cmake python zlib1g-dev flex bison libtinfo-dev autoconf pkg-config m4 libtool wget
Contributor

I would also use --no-install-recommends

Contributor Author

Thanks. I'll eventually split this into a multi-stage build so the build tools won't need to be in the production image.
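A rough sketch of what such a multi-stage split could look like, assuming only /home/retdec/retdec-install is needed at runtime; the runtime package list below is a guess and would need verification:

FROM ubuntu:bionic AS builder
RUN apt-get -y update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        build-essential git cmake python zlib1g-dev flex bison libtinfo-dev \
        autoconf pkg-config m4 libtool wget
# ... clone or COPY the sources and build into /home/retdec/retdec-install ...

FROM ubuntu:bionic
RUN apt-get -y update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        python bc graphviz upx && \
    rm -rf /var/lib/apt/lists/*
COPY --from=builder /home/retdec/retdec-install /home/retdec/retdec-install
ENV PATH="${PATH}:/home/retdec/retdec-install/bin"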

DEBIAN_FRONTEND=noninteractive apt-get install -y build-essential git bc graphviz upx cmake python zlib1g-dev flex bison libtinfo-dev autoconf pkg-config m4 libtool wget

USER retdec
RUN git clone --recursive https://github.com/avast-tl/retdec && \
Contributor

I would prefer to use COPY.
Could it be optional somehow? Maybe two Dockerfiles: one for local changes and one from upstream master?

Contributor Author

That's what I suggested.

Since it's not my use case, I'll let someone else create a "development" Dockerfile that they like.

@Manouchehri (Contributor Author)

@s3rvac Any feedback from the Avast team? =)

cd retdec && \
mkdir build && \
cd build && \
cmake .. -DCMAKE_INSTALL_PREFIX=/home/retdec/retdec-install && \
Member

Shouldn't we install it into /home/retdec/install? The additional retdec- prefix seems redundant to me. Or is there a motivation behind the current path?

Contributor Author

The reason I use retdec-install is that if the end user adds more things to /home/retdec/, it can be a bit ambiguous what an install/ folder is referring to.

Not opposed to doing it either way though, which do you want me to use? =)

@aidanhs commented Dec 19, 2017

Ok, I get that having two Dockerfiles might be useful. But in that case, I'd like to suggest renaming the file in this PR to Dockerfile.master or Dockerfile.prod or something, so when I create a PR for a COPY Dockerfile it will be the default (as it's the one that people will most likely want to use if they already have the repo cloned).

On that note, it'd be cool to have some instructions for using this particular Dockerfile - for example, you will (when this PR is merged) be able to just curl -sSL https://raw.githubusercontent.com/avast-tl/retdec/master/Dockerfile.prod | docker build - to get the latest build without cloning.

@Manouchehri (Contributor Author)

@aidanhs Dockerfiles must be named Dockerfile, or they cannot be built on platforms like Docker Hub.

Once a functional Dockerfile like you're describing is written, we can switch to that. At the moment it's nonexistent, so it makes little sense to make this PR harder to use.

@HugoKlepsch (Contributor)

@Manouchehri In that case the path it is under must make it obvious it is for publishing only, not development, e.g. '/productionDockerfile/Dockerfile' or the like.

@Manouchehri (Contributor Author) commented Dec 19, 2017

Sure, but why do that if there's currently only one Dockerfile to begin with? If another Dockerfile is written, it should be added to a folder at that point. We're debating folder structure for files that don't exist.

@HugoKlepsch (Contributor)

Honestly, if this patch were complete, it would have Dockerfiles for all the common use cases (in our case, production and development).

It's better not to have to change READMEs, build scripts, etc. later when the other common use case Dockerfile is added.

@Manouchehri (Contributor Author)

@HugoKlepsch Okay, go do that yourself then.

@aidanhs commented Dec 20, 2017

Dockerfiles must be named Dockerfile, or they cannot be built on platforms like Docker Hub.

FWIW, this isn't true (though I think it used to be the case) - you can see that https://hub.docker.com/r/rustlang/crater/builds/bnftxfvygyjsakpah66cvyh/ uses the docker/Dockerfile.mini file from https://github.com/rust-lang-nursery/crater/tree/b3168a8/docker
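For local builds, a non-default file name also works via the -f flag; for example (Dockerfile.dev here is just a placeholder name for the development variant being discussed):

docker build -f Dockerfile.dev -t retdec-dev .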

@HugoKlepsch (Contributor)

If that is the case, maybe we can have a Dockerfile and a Dockerfile.dev. I'm working on such a development Dockerfile in my off time now.

MatejKastak added a commit to MatejKastak/retdec that referenced this pull request Sep 13, 2018
- MOV, MVN and MOVZ instructions
- operand shift functions moved and changed for ARM64
- instructions like 'movz x0, #3 LSL 16' work now
MatejKastak added a commit to MatejKastak/retdec that referenced this pull request Sep 22, 2018
- Register parent map
- Storing registers
- Loading registers
- Headers

- Need more changes to conversions, I think 'mov w0, #3' zeroes out
  the upper 32bits of x0 register. But need to investigate further.
PeterMatula pushed a commit that referenced this pull request Mar 28, 2019
* Capstone2llvmirtool default basic modes for architectures

Run tool with reasonable Capstone basic modes for specified architecture.
Default values are as follows:
-a arm   : CS_MODE_ARM
-a arm64 : CS_MODE_ARM [looks like keystone doesn't like this]
-a mips  : CS_MODE_MIPS32
-a x86   : CS_MODE_32
-a ppc   : CS_MODE_32
-a <rest>: CS_MODE_LITTLE_ENDIAN

* Base for the ARM64 translator

- register maps(_reg2type)
- instructions map(_i2fm)
Modified ARM Translator unit, Work in progress.

* Fix the cs_reg_name

- register name could not be found because of the wrong cs_arch in constructor

* Add ARM64 support for capstone dependency

- capstone was configured without the ARM64 support, this caused
  cs_open to fail

* Temporary solution to call translate function

* Status register and program counter added to environment

- flags from status register added to arm64 env
- program counter added to arm64 env

* Methods store/load registers/operands skeletons + add instruction

- basic implementation of functions needed for loading and storing operands
- translateAdd is for testing purposes

* Store instruction base

- started implementation of MEM operand type
- Store register instruction translation method
e.g. retdec-capstone2llvmir -a arm64 -t 'str x0, [x1]'

* Operand shifts ported from ARM and MOV instruction translation

- MOV, MVN and MOVZ instructions
- operand shift functions moved and changed for ARM64
- instructions like 'movz x0, #3 LSL 16' work now

* Arm64 - tests ported from Arm

- test framework capstone2llvmirtranslator
- first INS_ADD test
- cmake compilation

* Basic MOV tests

- MOV, MOVZ

* Test for STR instruction and test header comments

* STP instruction + tests, pc in new enum, get op addr function

- Store pair instruction{pre-index, post-index, signed-offset}
- test for all cases except 32bit operands
- pc moved to its own enum
- generateGetOperandAddr to generate address from instruction operand

* LDR + STR, LDR tests from ARM, LDP stub

- LDR{pre-index, post-index, signed-offset} instruction implemented
- STR{pre-index, post-index, signed-offset} instruction implemented
- LDR tests ported from ARM
- LDP todo

* Implemented parent register handling

- Register parent map
- Storing registers
- Loading registers
- Headers

- Need more changes to conversions, I think 'mov w0, #3' zeroes out
  the upper 32bits of x0 register. But need to investigate further.

* LLVM data layout modified for ARM64

- taken from uname -a in qemu arm64 machine
Linux debian-aarch64 4.9.0-4-arm64 #1 SMP Debian
4.9.65-3+deb9u1 (2017-12-23) aarch64 GNU/Linux

* Removed useless debug output

* getCarryRegister for ARM64 fixed

* Store register ZEXT_TRUNC, 32bit tests baseline + tests

- when writing a value to a 32-bit register, the value is zero
  extended to the whole 64-bit register
- parent register mapping enabled in tests
- 32bit version of tests

* Zero extension tests for ADD and MOV 32bit variants

* Implemented BL instruction

- added tests for label and imm branch

* Implemented RET instruction

- added tests

* Implemented LDP instruction

- added tests for instruction

* Implemented ADRP instruction

- real binary testing is needed
- without tests

* enable arm64 in decompiler.py and add arm64 architecture

in Architecture::setArch() ARM64 needs to be set before ARM
because "arm" from ARM matches the "arm aarch64" from ARM64

* Arm64 ABI implementation

* Arm64 decoder ported from Arm

* Arm64 imm operand shifts should not update flags by default.

- Added the option to switch this behaviour
- add one ADD test with shift

* Operand register extension generator + 64bit variant extension tests

- Arm supports the extension of operand e.g. 'add x0, x1, w2, SXTW'
  will sign-extend the w2 register to 64 bit and after that add the values
- test for 64bit variant implemented
- need to check the optional imm(shift VM outputs weird values)

* Arm64 Zero/Sign extension 32bit variant tests

* Implemented SUB instruction

- added tests for instruction

* Implemented BR instruction

- added tests for instruction

* Arm64 syscall id register is X8

* Specified call and return instruction ID for implemented instruction

- BL Branch link is hinting the function call
- RET is hinting the function return

* Fixed compilation after merge

- new methods added isOperandRegister, getOperandAccess
- loadOpTernaryop1op2 probably changed to loadOpBinaryOrTernaryOp1Op2
- made sure all unit tests passed
- TODO: implement new conventions from master

* Generate pseudoasm instruction when translation routine is not found

- Function to generate condition code

* Check preconditions in implemented arm64 instructions

* Changed register generation to match other modules.

* LDR instruction all 3 formats + tests

- register
- imm
- literal (label)

* Binaries can now be decompiled

- jumpTargetDryRun updated

* Generate condition codes for conditional instructions.

* ARM64: strb, strh instructions + tests

* Arm64: conditional and unconditional branch instruction + tests

- removed the generation of conditional code in the translate instruction function;
  this is not necessary because the condition is generated in the body of the given
  instruction, and arm64 allows only specific instructions to be conditional.

* Arm64: Instruction ret can have optional register operand + test

* Arm64: BLR instruction + test

* Arm64: CBNZ, CBZ instruction + test

* Arm64: TBNZ, TBZ implementation + tests

* Arm64: LDR different size variants, sign/zero extend + tests

* Arm64: LDPSW instruction + tests

- minor warning fix in STR instruction

* Arm64: ADC instruction + tests

- including flag setting for ADC and ADD instructions
- ADDS tests

* Arm64: ADCS 32bit tests for flags

* Arm64: ADR, ADRP instruction + tests

* Arm64: AND, ANDS instruction + tests

* Arm64: ASR instruction + tests

- ASRV variant

* Arm64: LSL, LSR, ROR instructions + tests

- all major shifts implemented

* Arm64: SUB, SBC flags + tests

- changed asserts to exceptions

* Arm64: CMP, CMN instructions + tests

* Arm64: CSEL instruction + tests

* Arm64: CSET, CSETM instruction + tests

* Arm64: MUL instruction + tests

* Arm64: MADD instruction + tests

- 32bit tests for MUL

* Arm64: MSUB instruction + tests

* Arm64: MNEG instruction + tests

* Arm64: NEG, NEGS instruction + tests

* Arm64: NGC, NGCS initial implementation + tests

- Check the carry flags + add tests

* Arm64: SDIV, UDIV instruction + tests

* Arm64: Fix correct semantics for SBC and NEG instructions

* Arm64: SMADDL, UMADDL instruction + tests

* Arm64: UMSUBL, SMSUBL instruction + tests

* Arm64: SMNEG, UMNEG instruction + tests

* Arm64: UMULL, SMULL, UMULH, SMULH instruction + tests

* Arm64: Conditional select operation instruction + tests

* Arm64: CINC, CINV, CNEG tests

* Arm64: EON, EOR instruction + tests

* Arm64: ORN, ORR instruction + tests

* Arm64: TST instruction + tests

- fixed the AND instruction to set carry and overflow flags to zero

* Arm64: EXTR instruction + tests

* Arm64: Extend instructions + tests

* Arm64: CCMN, CCMP instruction + tests

* Arm64: NOP instruction + tests

* Arm64: REV, RBIT, CLZ instructions + tests

* Arm64: BIC instruction + tests

* Arm64: Unprivileged loads/stores instructions + tests

* Arm64: Load/Store exclusive instructions + tests

* ARM64: LDAXR instruction variants + tests

* Arm64: LDAR instruction variants + tests

* Arm64, llvmir-emul: don't lower bitreverse intrinsic

- updated tests to check if the correct intrinsic functions was
  called

* Arm64: FP environment + basic unary and binary operations + tests

* Arm64: FMIN, FMINNM, FMAX, FMAXNM instruction + tests

* Arm64: FCMP, FCCMP, FCVT, {U, S}CVTF instructions + tests

* Arm64: FCVTZS, FCVTZU instructions + tests

- let's start testing

* Arm64, bin2llvmir: Decoder should not analyse stack.

* Arm64: MOVK instruction + tests

* Arm64: MOVN instructions + tests

* Merge master with arm-prep

* Architecture: Change arm architectures to account for arm64

-> isArmOrThumb renamed to isArm32OrThumb
-> added isArm32 method
-> thumb is now set with a flag _thumbFlag

* Architecture: Removed the wrong architecture types

Now the enum eArch represents only general architecture and all
subtypes of architecture are checked to getBitSize() or _thumbFlag.

The function isArm() returns true for every type of subarchitecture,
e.g. {arm32, arm64 or thumb}

* Arm64: XZR loads zero and discards result when written

- Added some instruction IDs to branch types

* Arm64: STR and LDR instructions now determine correct register size

- For example 'str w0, [sp]' should store only 4 bytes to the stack pointer

* Arm64: Syscall optimization and detection

Replace svc #0 with corresponding syscall decoded from previous assignments.

* Arm64: MOVI instructions + tests, Vector and half register

Generate vector registers so that we don't crash in case pseudo instructions
with them as operands are generated. For a similar purpose I
changed the f16 in ARM64_REG_H* to i16, since the half type is not
supported and we want to be able to at least generate pseudo instructions.

* Arm64: STR and LDR tests

Those tests target loading and storing floating point values.

* Arm64: Removed zero division semantics from llvmir

- Zero division is NOW undefined behaviour
- This caused problems in modulo idiom detection
- Also removed corresponding tests

* Arm64: FMOV instruction with immediate values

- Correctly handle imm values as operands of this instruction

* Revert "Arm64, bin2llvmir: Decoder should not analyse stack."

This reverts commit 7b88475.
This change caused other tests to fail.

* Arm64: Simplified and documented some code

- Removed unused code from decoder/arm64.cpp
- Fixed insnWrittesPcArm64 to work better
- Fixed Cond branch tests

* Arm64: Fixed documentation build