[dfsan] Wrap glibc 2.38 __isoc23_* functions #79958

Merged

@MaskRay MaskRay commented Jan 30, 2024

Fix #79283: test/dfsan/custom.cpp fails to link on glibc 2.38 with
undefined symbol errors because DFSan lacks wrappers for the
__isoc23_strtol and __isoc23_scanf families of functions.

Implement these wrappers as aliases to existing wrappers, similar to
https://reviews.llvm.org/D158943 for other sanitizers.
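
As a rough illustration of the aliasing idea (not the code in this PR), the sketch below binds a second symbol name to an existing function with the GCC/Clang `alias` attribute. The `demo_*` names are hypothetical, and the attribute is assumed to be available, which in practice means an ELF target.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical names for illustration only; the DFSan runtime uses its own
// __dfsw_/__dfso_ prefixes and a macro to generate these declarations.
extern "C" {
// An existing wrapper with a full definition.
long demo_wrap_strtol(const char *nptr, char **endptr, int base) {
  assert(nptr);
  return std::strtol(nptr, endptr, base);
}

// A second symbol bound to the same code. The alias target is named by its
// unmangled symbol and must be defined in this translation unit.
long demo_wrap_isoc23_strtol(const char *nptr, char **endptr, int base)
    __attribute__((alias("demo_wrap_strtol")));
}
```

Calling `demo_wrap_isoc23_strtol("42", nullptr, 10)` runs the body of `demo_wrap_strtol`, so no second definition has to be maintained.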

When libclang_rt.dfsan.a is built on glibc 2.38+, strtol in a user
program uses the C23 semantics (strtol("0b1", 0, 0) => 1), whether or
not _ISOC2X_SOURCE is defined.
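
To see which semantics a given build gets, a small probe along these lines may help. It is only a sketch: `ParseResult` and `parse_auto_base` are hypothetical names, and the result for `"0b1"` depends on the libc the program is compiled and linked against.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Value returned by strtol plus the number of characters it consumed.
struct ParseResult {
  long value;
  std::size_t consumed;
};

// Parse with base 0 so that strtol picks the base from the prefix.
inline ParseResult parse_auto_base(const char *s) {
  assert(s);
  char *end = nullptr;
  long v = std::strtol(s, &end, 0);
  return {v, static_cast<std::size_t>(end - s)};
}

// With C23 semantics, parse_auto_base("0b1") consumes all three characters
// and yields 1. With the older semantics only the leading "0" is consumed
// and the value is 0.
```

Inputs such as `"0x10"` parse the same way under both behaviors, so only the binary prefix distinguishes them.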

Created using spr 1.3.4

llvmbot commented Jan 30, 2024

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Fangrui Song (MaskRay)

Full diff: https://github.com/llvm/llvm-project/pull/79958.diff

3 Files Affected:

  • (modified) compiler-rt/lib/dfsan/dfsan_custom.cpp (+47-154)
  • (modified) compiler-rt/lib/dfsan/done_abilist.txt (+6)
  • (modified) compiler-rt/lib/dfsan/libc_ubuntu1404_abilist.txt (+5)
diff --git a/compiler-rt/lib/dfsan/dfsan_custom.cpp b/compiler-rt/lib/dfsan/dfsan_custom.cpp
index 85b796bd6349c..db55958922a6e 100644
--- a/compiler-rt/lib/dfsan/dfsan_custom.cpp
+++ b/compiler-rt/lib/dfsan/dfsan_custom.cpp
@@ -55,6 +55,10 @@ using namespace __dfsan;
 #define DECLARE_WEAK_INTERCEPTOR_HOOK(f, ...) \
 SANITIZER_INTERFACE_ATTRIBUTE SANITIZER_WEAK_ATTRIBUTE void f(__VA_ARGS__);
 
+#define WRAPPER_ALIAS(fun, real)                                          \
+  SANITIZER_INTERFACE_ATTRIBUTE void __dfsw_##fun() ALIAS(__dfsw_##real); \
+  SANITIZER_INTERFACE_ATTRIBUTE void __dfso_##fun() ALIAS(__dfso_##real);
+
 // Async-safe, non-reentrant spin lock.
 class SignalSpinLocker {
  public:
@@ -1197,16 +1201,20 @@ char *__dfso_strcpy(char *dest, const char *src, dfsan_label dst_label,
   *ret_origin = dst_origin;
   return ret;
 }
+}
 
-static long int dfsan_strtol(const char *nptr, char **endptr, int base,
-                             char **tmp_endptr) {
+template <typename Fn>
+static ALWAYS_INLINE auto dfsan_strtol_impl(
+    Fn real, const char *nptr, char **endptr, int base,
+    char **tmp_endptr) -> decltype(real(nullptr, nullptr, 0)) {
   assert(tmp_endptr);
-  long int ret = strtol(nptr, tmp_endptr, base);
+  auto ret = real(nptr, tmp_endptr, base);
   if (endptr)
     *endptr = *tmp_endptr;
   return ret;
 }
 
+extern "C" {
 static void dfsan_strtolong_label(const char *nptr, const char *tmp_endptr,
                                   dfsan_label base_label,
                                   dfsan_label *ret_label) {
@@ -1236,30 +1244,6 @@ static void dfsan_strtolong_origin(const char *nptr, const char *tmp_endptr,
   }
 }
 
-SANITIZER_INTERFACE_ATTRIBUTE
-long int __dfsw_strtol(const char *nptr, char **endptr, int base,
-                       dfsan_label nptr_label, dfsan_label endptr_label,
-                       dfsan_label base_label, dfsan_label *ret_label) {
-  char *tmp_endptr;
-  long int ret = dfsan_strtol(nptr, endptr, base, &tmp_endptr);
-  dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label);
-  return ret;
-}
-
-SANITIZER_INTERFACE_ATTRIBUTE
-long int __dfso_strtol(const char *nptr, char **endptr, int base,
-                       dfsan_label nptr_label, dfsan_label endptr_label,
-                       dfsan_label base_label, dfsan_label *ret_label,
-                       dfsan_origin nptr_origin, dfsan_origin endptr_origin,
-                       dfsan_origin base_origin, dfsan_origin *ret_origin) {
-  char *tmp_endptr;
-  long int ret = dfsan_strtol(nptr, endptr, base, &tmp_endptr);
-  dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label);
-  dfsan_strtolong_origin(nptr, tmp_endptr, base_label, ret_label, base_origin,
-                         ret_origin);
-  return ret;
-}
-
 static double dfsan_strtod(const char *nptr, char **endptr, char **tmp_endptr) {
   assert(tmp_endptr);
   double ret = strtod(nptr, tmp_endptr);
@@ -1307,108 +1291,40 @@ double __dfso_strtod(const char *nptr, char **endptr, dfsan_label nptr_label,
   return ret;
 }
 
-static long long int dfsan_strtoll(const char *nptr, char **endptr, int base,
-                                   char **tmp_endptr) {
-  assert(tmp_endptr);
-  long long int ret = strtoll(nptr, tmp_endptr, base);
-  if (endptr)
-    *endptr = *tmp_endptr;
-  return ret;
-}
-
-SANITIZER_INTERFACE_ATTRIBUTE
-long long int __dfsw_strtoll(const char *nptr, char **endptr, int base,
-                             dfsan_label nptr_label, dfsan_label endptr_label,
-                             dfsan_label base_label, dfsan_label *ret_label) {
-  char *tmp_endptr;
-  long long int ret = dfsan_strtoll(nptr, endptr, base, &tmp_endptr);
-  dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label);
-  return ret;
-}
-
-SANITIZER_INTERFACE_ATTRIBUTE
-long long int __dfso_strtoll(const char *nptr, char **endptr, int base,
-                             dfsan_label nptr_label, dfsan_label endptr_label,
-                             dfsan_label base_label, dfsan_label *ret_label,
-                             dfsan_origin nptr_origin,
-                             dfsan_origin endptr_origin,
-                             dfsan_origin base_origin,
-                             dfsan_origin *ret_origin) {
-  char *tmp_endptr;
-  long long int ret = dfsan_strtoll(nptr, endptr, base, &tmp_endptr);
-  dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label);
-  dfsan_strtolong_origin(nptr, tmp_endptr, base_label, ret_label, base_origin,
-                         ret_origin);
-  return ret;
-}
-
-static unsigned long int dfsan_strtoul(const char *nptr, char **endptr,
-                                       int base, char **tmp_endptr) {
-  assert(tmp_endptr);
-  unsigned long int ret = strtoul(nptr, tmp_endptr, base);
-  if (endptr)
-    *endptr = *tmp_endptr;
-  return ret;
-}
-
-SANITIZER_INTERFACE_ATTRIBUTE
-unsigned long int __dfsw_strtoul(const char *nptr, char **endptr, int base,
-                       dfsan_label nptr_label, dfsan_label endptr_label,
-                       dfsan_label base_label, dfsan_label *ret_label) {
-  char *tmp_endptr;
-  unsigned long int ret = dfsan_strtoul(nptr, endptr, base, &tmp_endptr);
-  dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label);
-  return ret;
-}
-
-SANITIZER_INTERFACE_ATTRIBUTE
-unsigned long int __dfso_strtoul(
-    const char *nptr, char **endptr, int base, dfsan_label nptr_label,
-    dfsan_label endptr_label, dfsan_label base_label, dfsan_label *ret_label,
-    dfsan_origin nptr_origin, dfsan_origin endptr_origin,
-    dfsan_origin base_origin, dfsan_origin *ret_origin) {
-  char *tmp_endptr;
-  unsigned long int ret = dfsan_strtoul(nptr, endptr, base, &tmp_endptr);
-  dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label);
-  dfsan_strtolong_origin(nptr, tmp_endptr, base_label, ret_label, base_origin,
-                         ret_origin);
-  return ret;
-}
-
-static long long unsigned int dfsan_strtoull(const char *nptr, char **endptr,
-                                             int base, char **tmp_endptr) {
-  assert(tmp_endptr);
-  long long unsigned int ret = strtoull(nptr, tmp_endptr, base);
-  if (endptr)
-    *endptr = *tmp_endptr;
-  return ret;
-}
-
-SANITIZER_INTERFACE_ATTRIBUTE
-long long unsigned int __dfsw_strtoull(const char *nptr, char **endptr,
-                                       int base, dfsan_label nptr_label,
-                                       dfsan_label endptr_label,
-                                       dfsan_label base_label,
-                                       dfsan_label *ret_label) {
-  char *tmp_endptr;
-  long long unsigned int ret = dfsan_strtoull(nptr, endptr, base, &tmp_endptr);
-  dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label);
-  return ret;
-}
-
-SANITIZER_INTERFACE_ATTRIBUTE
-long long unsigned int __dfso_strtoull(
-    const char *nptr, char **endptr, int base, dfsan_label nptr_label,
-    dfsan_label endptr_label, dfsan_label base_label, dfsan_label *ret_label,
-    dfsan_origin nptr_origin, dfsan_origin endptr_origin,
-    dfsan_origin base_origin, dfsan_origin *ret_origin) {
-  char *tmp_endptr;
-  long long unsigned int ret = dfsan_strtoull(nptr, endptr, base, &tmp_endptr);
-  dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label);
-  dfsan_strtolong_origin(nptr, tmp_endptr, base_label, ret_label, base_origin,
-                         ret_origin);
-  return ret;
-}
+WRAPPER_ALIAS(__isoc23_strtod, strtod)
+
+#define WRAPPER_STRTO(ret_type, fun)                                     \
+  SANITIZER_INTERFACE_ATTRIBUTE ret_type __dfsw_##fun(                   \
+      const char *nptr, char **endptr, int base, dfsan_label nptr_label, \
+      dfsan_label endptr_label, dfsan_label base_label,                  \
+      dfsan_label *ret_label) {                                          \
+    char *tmp_endptr;                                                    \
+    auto ret = dfsan_strtol_impl(fun, nptr, endptr, base, &tmp_endptr);  \
+    dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label);      \
+    return ret;                                                          \
+  }                                                                      \
+  SANITIZER_INTERFACE_ATTRIBUTE ret_type __dfso_##fun(                   \
+      const char *nptr, char **endptr, int base, dfsan_label nptr_label, \
+      dfsan_label endptr_label, dfsan_label base_label,                  \
+      dfsan_label *ret_label, dfsan_origin nptr_origin,                  \
+      dfsan_origin endptr_origin, dfsan_origin base_origin,              \
+      dfsan_origin *ret_origin) {                                        \
+    char *tmp_endptr;                                                    \
+    auto ret = dfsan_strtol_impl(fun, nptr, endptr, base, &tmp_endptr);  \
+    dfsan_strtolong_label(nptr, tmp_endptr, base_label, ret_label);      \
+    dfsan_strtolong_origin(nptr, tmp_endptr, base_label, ret_label,      \
+                           base_origin, ret_origin);                     \
+    return ret;                                                          \
+  }
+
+WRAPPER_STRTO(long, strtol)
+WRAPPER_STRTO(long long, strtoll)
+WRAPPER_STRTO(unsigned long, strtoul)
+WRAPPER_STRTO(unsigned long long, strtoull)
+WRAPPER_ALIAS(__isoc23_strtol, strtol)
+WRAPPER_ALIAS(__isoc23_strtoll, strtoll)
+WRAPPER_ALIAS(__isoc23_strtoul, strtoul)
+WRAPPER_ALIAS(__isoc23_strtoull, strtoull)
 
 SANITIZER_INTERFACE_ATTRIBUTE
 time_t __dfsw_time(time_t *t, dfsan_label t_label, dfsan_label *ret_label) {
@@ -2866,31 +2782,8 @@ int __dfso_sscanf(char *str, const char *format, dfsan_label str_label,
   return ret;
 }
 
-SANITIZER_INTERFACE_ATTRIBUTE
-int __dfsw___isoc99_sscanf(char *str, const char *format, dfsan_label str_label,
-                           dfsan_label format_label, dfsan_label *va_labels,
-                           dfsan_label *ret_label, ...) {
-  va_list ap;
-  va_start(ap, ret_label);
-  int ret = scan_buffer(str, ~0ul, format, va_labels, ret_label, nullptr,
-                        nullptr, ap);
-  va_end(ap);
-  return ret;
-}
-
-SANITIZER_INTERFACE_ATTRIBUTE
-int __dfso___isoc99_sscanf(char *str, const char *format, dfsan_label str_label,
-                           dfsan_label format_label, dfsan_label *va_labels,
-                           dfsan_label *ret_label, dfsan_origin str_origin,
-                           dfsan_origin format_origin, dfsan_origin *va_origins,
-                           dfsan_origin *ret_origin, ...) {
-  va_list ap;
-  va_start(ap, ret_origin);
-  int ret = scan_buffer(str, ~0ul, format, va_labels, ret_label, &str_origin,
-                        ret_origin, ap);
-  va_end(ap);
-  return ret;
-}
+WRAPPER_ALIAS(__isoc99_sscanf, sscanf)
+WRAPPER_ALIAS(__isoc23_sscanf, sscanf)
 
 static void BeforeFork() {
   StackDepotLockBeforeFork();
diff --git a/compiler-rt/lib/dfsan/done_abilist.txt b/compiler-rt/lib/dfsan/done_abilist.txt
index c582584d77e45..86a42ee1b4dce 100644
--- a/compiler-rt/lib/dfsan/done_abilist.txt
+++ b/compiler-rt/lib/dfsan/done_abilist.txt
@@ -270,6 +270,11 @@ fun:strtoul=custom
 fun:strtoull=custom
 fun:strcat=custom
 fun:strncat=custom
+fun:__isoc23_strtod=custom
+fun:__isoc23_strtol=custom
+fun:__isoc23_strtoll=custom
+fun:__isoc23_strtoul=custom
+fun:__isoc23_strtoull=custom
 
 # Functions that produce an output that is computed from the input, but is not
 # necessarily data dependent.
@@ -311,6 +316,7 @@ fun:snprintf=custom
 # scanf-like
 fun:sscanf=custom
 fun:__isoc99_sscanf=custom
+fun:__isoc23_sscanf=custom
 
 # TODO: custom
 fun:asprintf=discard
diff --git a/compiler-rt/lib/dfsan/libc_ubuntu1404_abilist.txt b/compiler-rt/lib/dfsan/libc_ubuntu1404_abilist.txt
index 433092e2b27b8..9ffa56a238185 100644
--- a/compiler-rt/lib/dfsan/libc_ubuntu1404_abilist.txt
+++ b/compiler-rt/lib/dfsan/libc_ubuntu1404_abilist.txt
@@ -1,3 +1,8 @@
+fun:__isoc23_sscanf=uninstrumented
+fun:__isoc23_strtol=uninstrumented
+fun:__isoc23_strtoll=uninstrumented
+fun:__isoc23_strtoul=uninstrumented
+fun:__isoc23_strtoull=uninstrumented
 fun:_Exit=uninstrumented
 fun:_IO_adjust_column=uninstrumented
 fun:_IO_adjust_wcolumn=uninstrumented
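
For readers skimming the diff, the refactoring pattern can be sketched in isolation: one template helper takes the real `strto*` function as a parameter and deduces its return type, and a macro stamps out one thin wrapper per function. This is a simplified stand-in, not the runtime code; `strto_impl`, `DEMO_WRAP_STRTO` and the `demo_*` functions are hypothetical, and the label and origin handling is left out.

```cpp
#include <cassert>
#include <stdlib.h>

// Generic helper: call `real`, then mirror the end pointer into `endptr`
// when the caller asked for it. The return type follows `real`.
template <typename Fn>
static inline auto strto_impl(Fn real, const char *nptr, char **endptr,
                              int base, char **tmp_endptr)
    -> decltype(real(nullptr, nullptr, 0)) {
  assert(tmp_endptr);
  auto ret = real(nptr, tmp_endptr, base);
  if (endptr)
    *endptr = *tmp_endptr;
  return ret;
}

// Hypothetical macro: generate demo_<fun> as a thin wrapper around <fun>.
#define DEMO_WRAP_STRTO(ret_type, fun)                             \
  ret_type demo_##fun(const char *nptr, char **endptr, int base) { \
    char *tmp_endptr;                                              \
    return strto_impl(fun, nptr, endptr, base, &tmp_endptr);       \
  }

DEMO_WRAP_STRTO(long, strtol)
DEMO_WRAP_STRTO(unsigned long long, strtoull)
```

Because the helper is parameterized on the callee, adding another `strto*` variant is a single macro invocation rather than another copy of the wrapper body.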

github-actions bot commented Jan 30, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff a12390e6202eaf1e5d7810109756e10079b9144b bbb7ed85fe6ac8879417e9484ab95f3ccaf96a3f -- compiler-rt/lib/dfsan/dfsan_custom.cpp
View the diff from clang-format here.
diff --git a/compiler-rt/lib/dfsan/dfsan_custom.cpp b/compiler-rt/lib/dfsan/dfsan_custom.cpp
index 3af26e9f64..14173b2b96 100644
--- a/compiler-rt/lib/dfsan/dfsan_custom.cpp
+++ b/compiler-rt/lib/dfsan/dfsan_custom.cpp
@@ -1204,9 +1204,10 @@ char *__dfso_strcpy(char *dest, const char *src, dfsan_label dst_label,
 }
 
 template <typename Fn>
-static ALWAYS_INLINE auto dfsan_strtol_impl(
-    Fn real, const char *nptr, char **endptr, int base,
-    char **tmp_endptr) -> decltype(real(nullptr, nullptr, 0)) {
+static ALWAYS_INLINE auto dfsan_strtol_impl(Fn real, const char *nptr,
+                                            char **endptr, int base,
+                                            char **tmp_endptr)
+    -> decltype(real(nullptr, nullptr, 0)) {
   assert(tmp_endptr);
   auto ret = real(nptr, tmp_endptr, base);
   if (endptr)

@mgorny mgorny left a comment

Thanks a lot! I can confirm that all tests pass with this patch, for me. I also love the simplification!

@MaskRay MaskRay changed the title [dfsan] Support glibc 2.38 __isoc23_* functions [dfsan] Wrap glibc 2.38 __isoc23_* functions Jan 30, 2024
@MaskRay MaskRay merged commit 6485600 into main Jan 30, 2024
3 of 4 checks passed
@MaskRay MaskRay deleted the users/MaskRay/spr/dfsan-support-glibc-238-__isoc23_-functions branch January 30, 2024 21:58
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Jan 30, 2024
Fix llvm#79283: `test/dfsan/custom.cpp` has undefined symbol linker errors
on glibc 2.38 due to lack of wrappers for `__isoc23_strtol` and
`__isoc23_scanf` family functions.

Implement these wrappers as aliases to existing wrappers, similar to
https://reviews.llvm.org/D158943 for other sanitizers.

`strtol` in a user program, whether or not `_ISOC2X_SOURCE` is defined,
uses the C23 semantics (`strtol("0b1", 0, 0)` => 1), when
`libclang_rt.dfsan.a` is built on glibc 2.38+.

(cherry picked from commit 6485600)
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Feb 9, 2024
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
@pointhex pointhex mentioned this pull request May 7, 2024
Successfully merging this pull request may close these issues.

DataFlowSanitizer-x86_64 :: custom.cpp: undefined reference to `__isoc23_strtol.dfsan' (and more)